Methods and compositions for diagnosing and treating chromosome-18p related disorders

ABSTRACT

The present invention relates to the mammalian HKNG1 gene, a gene associated with bipolar affective disorder (BAD) in humans. The invention relates, in particular, to methods for the diagnostic evaluation, genetic testing and prognosis of HKNG1 neuropsychiatric disorders including schizophrenia, attention deficit disorder, a schizoaffective disorder, a bipolar affective disorder or a unipolar affective disorder.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is

[0002] 1) a continuation-in-part of U.S. application Ser. No. 09/631,275, filed Aug. 2, 2000, which is a continuation-in-part of U.S. application Ser. No. 09/268,992, filed on Mar. 16, 1999, which is a continuation-in-part of U.S. application Ser. No. 09/236,134, filed on Jan. 22, 1999, which application claims the benefit of U.S. provisional application Ser. No. 60/078,044, filed on Mar. 16, 1998; of provisional application No. 60/088,312, filed on Jun. 5, 1998; and of provisional application No. 60/106,056 filed on Oct. 28, 1998,

[0003] and

[0004] 2) a continuation-in-part of U.S. application Ser. No. 09/722,544, filed Nov. 28, 2000, which is a continuation-in-part of U.S. application Ser. No. 09/236,134, filed Jan. 22, 1999, which application claims the benefit of U.S. provisional application serial No. 60/078,044, filed on Mar. 16, 1998; of provisional application No. 60/088,312, filed on Jun. 5, 1998; and of provisional application No. 60/106,056 filed on Oct. 28, 1998,

[0005] each of which applications in 1) and 2) is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0006] This invention was made with government support under grant numbers R01MH49499, K02MH01375, K01MH01748-01, MH00916, MH49499, MH48695, and MH47563 by the National Institutes of Health. The government has certain rights in the invention.

1. INTRODUCTION

[0007] The present invention relates, first, to a gene referred to herein as the HKNG1 gene and shown herein to be associated with central nervous system-related disorders, e.g., neuropsychiatric disorders, in particular, bipolar affective disorder and schizophrenia and with myopia-related disorders. The invention also relates to a gene for thymidylate synthase which is referred to herein as TS. The coding strand of TS is demonstrated herein to be located on the short arm of chromosome 18 and overlapping the coding strand of HKNG1. Thus, the gene TS is also within a region associated with central nervous system-related disorders, including, but not limited to, neuropsychiatric disorders, in particular, bipolar affective disorder and schizophrenia.

[0008] The invention includes recombinant DNA molecules and cloning vectors comprising sequences of the HKNG1 and/or the TS genes, and host cells and non-human host organisms engineered to contain such DNA molecules and cloning vectors. The present invention further relates to HKNG1 gene products, and to antibodies directed against such HKNG1 gene products. The present invention still further relates to TS gene products, and to antibodies directed against such TS gene products. The present invention also relates to methods of using the HKNG1 gene and HKNG1 gene product, to methods of using the TS gene and TS gene product, including drug screening assays, and diagnostic and therapeutic methods for the treatment of HKNG1- and/or TS-mediated disorders, including neuropsychiatric disorders such as bipolar affective disorder, as well as myopia disorders such as early-onset autosomal dominant myopia.

2. BACKGROUND OF THE INVENTION

[0009] There are only a few psychiatric disorders in which clinical manifestations of the disorder can be correlated with demonstrable defects in the structure and/or function of the nervous system. Well-known examples of such disorders include Huntington's disease, which can be traced to a mutation in a single gene and in which neurons in the striatum degenerate, and Parkinson's disease, in which dopaminergic neurons in the nigro-striatal pathway degenerate. The vast majority of psychiatric disorders, however, presumably involve subtle and/or undetectable changes, at the cellular and/or molecular levels, in nervous system structure and function. This lack of detectable neurological defects distinguishes “neuropsychiatric” disorders, such as schizophrenia, attention deficit disorders, schizoaffective disorder, bipolar affective disorders, or unipolar affective disorder, from neurological disorders, in which anatomical or biochemical pathologies are manifest. Hence, identification of the causative defects and the neuropathologies of neuropsychiatric disorders is needed in order to enable clinicians to evaluate and prescribe appropriate courses of treatment to cure or ameliorate the symptoms of these disorders.

[0010] One of the most prevalent and potentially devastating of neuropsychiatric disorders is bipolar affective disorder (BAD), also known as bipolar mood disorder (BP) or manic-depressive illness, which is characterized by episodes of elevated mood (mania) and depression (Goodwin, et al., 1990, Manic Depressive Illness, Oxford University Press, New York). The most severe and clinically distinctive forms of BAD are BP-I (severe bipolar affective (mood) disorder), which affects 2-3 million people in the United States, and SAD-M (schizoaffective disorder manic type). They are characterized by at least one full episode of mania, with or without episodes of major depression (defined by lowered mood, or depression, with associated disturbances in rhythmic behaviors such as sleeping, eating, and sexual activity). BP-I often co-segregates in families with more etiologically heterogeneous syndromes, such as with a unipolar affective disorder such as unipolar major depressive disorder (MDD), which is a more broadly defined phenotype (Freimer and Reus, 1992, in The Molecular and Genetic Basis of Neurological Disease, Rosenberg, et al., eds., Butterworths, New York, pp. 951-965; McInnes and Freimer, 1995, Curr. Opin. Genet. Develop., 5, 376-381). BP-I and SAD-M are severe mood disorders that are frequently difficult to distinguish from one another on a cross-sectional basis, follow similar clinical courses, and segregate together in family studies (Rosenthal, et al., 1980, Arch. General Psychiat. 37, 804-810; Levinson and Levitt, 1987, Am. J. Psychiat. 144, 415-426; Goodwin, et al., 1990, Manic Depressive Illness, Oxford University Press, New York). Hence, methods for distinguishing neuropsychiatric disorders such as these are needed in order to effectively diagnose and treat afflicted individuals.

[0011] Currently, individuals are typically evaluated for BAD using the criteria set forth in the most current version of the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders (DSM). While many drugs have been used to treat individuals diagnosed with BAD, including lithium salts, carbamazepine and valproic acid, none of the currently available drugs are adequate. For example, drug treatments are effective in only approximately 60-70% of individuals diagnosed with BP-I. Moreover, it is currently impossible to predict which drug treatments will be effective in, for example, particular BP-I affected individuals. Commonly, upon diagnosis, affected individuals are prescribed one drug after another until one is found to be effective. Early prescription of an effective drug treatment, therefore, is critical for several reasons, including the avoidance of extremely dangerous manic episodes, the risk of progressive deterioration if effective treatments are not found, and the risk of substantial side effects of current treatments.

[0012] The existence of a genetic component for BAD is strongly supported by segregation analyses and twin studies (Bertelson, et al., 1977, Br. J. Psychiat. 130, 330-351; Freimer and Reus, 1992, in The Molecular and Genetic Basis of Neurological Disease, Rosenberg, et al., eds., Butterworths, New York, pp. 951-965; Pauls, et al., 1992, Arch. Gen. Psychiat. 49, 703-708). Efforts to identify the chromosomal location of genes that might be involved in BP-I, however, have yielded disappointing results in that reports of linkage between BP-I and markers on chromosomes X and 11 could not be independently replicated nor confirmed in the re-analyses of the original pedigrees, indicating that with BAD linkage studies, even extremely high lod scores at a single locus can be false positives (Baron, et al., 1987, Nature 326, 289-292; Egeland, et al., 1987, Nature 325, 783-787; Kelsoe, et al., 1989, Nature 342, 238-243; Baron, et al., 1993, Nature Genet. 3, 49-55).

[0013] Recent investigations have suggested possible localization of BAD genes on chromosomes 18p and 21q, but in both cases the proposed candidate region is not well defined and no unequivocal support exists for either location (Berrettini, et al., 1994, Proc. Natl. Acad. Sci. USA 91, 5918-5921; Murray, et al., 1994, Science 265, 2049-2054; Pauls, et al., 1995, Am. J. Hum. Genet. 57, 636-643; Maier, et al., 1995, Psych. Res. 59, 7-15; Straub, et al., 1994, Nature Genet. 8, 291-296).

[0014] Mapping genes for common diseases believed to be caused by multiple genes, such as BAD, may be complicated by the typically imprecise definition of phenotypes, by etiologic heterogeneity, and by uncertainty about the mode of genetic transmission of the disease trait. With neuropsychiatric disorders there is even greater ambiguity in distinguishing individuals who likely carry an affected genotype from those who are genetically unaffected. For example, one can define an affected phenotype for BAD by including one or more of the broad grouping of diagnostic classifications that constitute the mood disorders: BP-I, SAD-M, MDD, and bipolar affective (mood) disorder with hypomania and major depression (BP-II).

[0015] Thus, one of the greatest difficulties facing psychiatric geneticists is uncertainty regarding the validity of phenotype designations, since clinical diagnoses are based solely on clinical observation and subjective reports. Also, with complex traits such as neuropsychiatric disorders, it is difficult to genetically map the trait-causing genes because: (1) neuropsychiatric disorder phenotypes do not exhibit classic Mendelian recessive or dominant inheritance patterns attributable to a single genetic locus; (2) there may be incomplete penetrance, i.e., individuals who inherit a predisposing allele may not manifest disease; (3) a phenocopy phenomenon may occur, i.e., individuals who do not inherit a predisposing allele may nevertheless develop disease due to environmental or random causes; and (4) genetic heterogeneity may exist, in which case mutations in any one of several genes may result in identical phenotypes.

[0016] Despite these difficulties, however, identification of the chromosomal location, sequence and function of genes and gene products responsible for causing neuropsychiatric disorders such as bipolar affective disorders is of great importance for genetic counseling, diagnosis and treatment of individuals in affected families.

3. SUMMARY OF THE INVENTION

[0017] The present invention relates, first, to the discovery, identification, and characterization of novel nucleic acid molecules that are associated with central nervous system (“CNS”) related disorders and processes including, but not limited to, human neuropsychiatric disorders such as schizophrenia, attention deficit disorder, schizoaffective disorder, dysthymic disorder, major depressive disorder, and bipolar affective disorder (“BAD”); including, e.g., severe bipolar affective (i.e., mood) disorder (i.e., BP-I), and bipolar affective (i.e., mood) disorder with hypomania and major depression (i.e., BP-II). The invention also relates to the discovery, identification and characterization of proteins encoded by such nucleic acid molecules, or by degenerate (i.e., allelic or homologous) variants thereof, or by orthologs (i.e., variants of the nucleic acid molecules that are expressed in other species) thereof. The invention still further relates to the discovery, identification and characterization of novel nucleic acid molecules that are associated with human myopia or nearsightedness, such as early-onset, autosomal dominant myopia as well as to the discovery, identification and characterization of proteins encoded by such nucleic acid molecules or by degenerate variants thereof.

[0018] The nucleic acid molecules of the present invention represent, first, nucleic acid sequences corresponding to a gene, or fragments thereof, referred to herein as HKNG1. As demonstrated in the Examples presented hereinbelow in Sections 6-8, 14 and 18, the HKNG1 gene is associated with human CNS-related disorders, e.g., neuropsychiatric disorders, in particular BAD. The HKNG1 gene is associated with other human neuropsychiatric disorders as well including, for example, schizophrenia. Further, as demonstrated in the Example presented in Section 14, the HKNG1 gene is also associated with human myopia, such as early-onset autosomal dominant myopia.

[0019] The nucleic acid molecules of the present invention also represent nucleic acid sequences corresponding to a second gene, or fragment thereof, referred to herein as TS. In particular, and as demonstrated in the example presented in Section 21, the coding sequences of TS are located on the short arm of chromosome 18 (18p). Thus, TS is also within a region of human chromosome 18 associated with human CNS-related disorders such as neuropsychiatric disorders, in particular BAD, as well as other human neuropsychiatric disorders such as schizophrenia.

[0020] The invention is based, in part, on the discovery of a narrow, 27 kb interval on the short arm of human chromosome 18, which is associated with and therefore contains a gene or genes associated with, the neuropsychiatric disorder BAD. The invention is also based on the discovery that this 27 kb interval lies within the HKNG1 gene, demonstrating that the HKNG1 gene is a gene associated with neuropsychiatric disorders such as BAD. The invention is further based on the discovery of novel HKNG1 cDNA sequences. In particular, the discovery of such cDNA sequences, which is also described hereinbelow in Section 7, has led to the elucidation of the HKNG1 genomic (that is, upstream untranslated, intron/exon and downstream untranslated) structure and to the discovery of full-length and alternately spliced HKNG1 variants as well as the elucidation of novel proteins encoded by such variants. These experiments are described in Sections 7, 10 and 18, below. The discovery of such cDNA sequences has also led to the elucidation of novel mammalian (e.g., guinea pig, bovine and rat) HKNG1 sequences, and also to the discovery of novel allelic variants and polymorphisms of such sequences, as described in Sections 10, 19, and 20, below.

[0021] The invention encompasses nucleic acid molecules which comprise the following nucleotide sequences: (a) nucleotide sequences (e.g., SEQ ID NOs: 1, 3, 5-7, 36-37 and 65) that comprise a human HKNG1 gene and/or encode a human HKNG1 gene product (e.g., SEQ ID NOs: 2 and 4), as well as allelic variants, homologs and orthologs thereof, including nucleotide sequences (e.g., SEQ ID NOs: 38, 40, 42, 44, 46-48, 109, 111, 113, 116 and 119) that encode non-human HKNG1 gene products (e.g., SEQ ID NOs: 39, 41, 43, 45, 49, 110, 112, 114, 117, 118 and 120); (b) nucleotide sequences comprising the novel HKNG1 sequences disclosed herein that encode mutants of the HKNG1 gene product in which sequences encoding all or a part of one or more of the HKNG1 domains are deleted or altered, or fragments thereof; (c) nucleotide sequences that encode fusion proteins comprising an HKNG1 gene product (e.g., SEQ ID NOs: 2 and 4), or a portion thereof fused to a heterologous polypeptide; and (d) nucleotide sequences within the HKNG1 gene, as well as chromosome 18p nucleotide sequences flanking the HKNG1 gene or located on the strand opposite the coding strand of the HKNG1 gene, which can be utilized, e.g., as primers, in the methods of the invention for identifying and diagnosing individuals at risk for or exhibiting an HKNG1-mediated disorder, such as BAD or schizophrenia, or for diagnosing individuals at risk for or exhibiting a form of myopia such as early-onset autosomal dominant myopia. The nucleic acid molecules of (a) through (d), above, can include, but are not limited to, cDNA, genomic DNA, and RNA sequences.

[0022] The invention further encompasses nucleic acid molecules which comprise: (i) nucleotide sequences (e.g., SEQ ID NO: 140) that comprise a TS gene (including a human TS gene) and/or encode a TS gene product (e.g., a human TS gene product), as well as allelic variants, homologs and orthologs thereof; (j) nucleotide sequences comprising one or more polymorphisms of the TS nucleotide sequence, including the polymorphisms described herein; (k) nucleotide sequences corresponding to fragments of a TS gene (e.g., fragments of SEQ ID NO: 140) that are at least 71, 73, 101, 137, 174, or 175 nucleotides in length or, alternatively, corresponding to fragments of a TS gene that are at least 204 nucleotides in length; and (l) nucleotide sequences within the TS gene, including chromosome 18p nucleotide sequences flanking or opposite the TS gene, which can be utilized, e.g., as primers in the methods of the invention for identifying and diagnosing individuals at risk for or exhibiting a TS-mediated disorder, such as BAD or schizophrenia. The nucleic acid molecules of (i) through (l), above, can include, but are not limited to, cDNA, genomic DNA, and RNA sequences.

[0023] The invention also encompasses the expression products of the nucleic acid molecules listed above; i.e., peptides, proteins, glycoproteins and/or polypeptides that are encoded by the HKNG1 and/or TS nucleic acid molecules of (a) through (l), above.

[0024] The compositions of the present invention further encompass agonists and antagonists of the HKNG1 and TS gene products, including small molecules (such as small organic molecules), and macromolecules (including antibodies), as well as nucleotide sequences that can be used to inhibit HKNG1 and/or TS gene expression (e.g., antisense and ribozyme molecules, and gene or regulatory sequence replacement constructs) or to enhance HKNG1 and/or TS gene expression (e.g., expression constructs that place the HKNG1 gene and/or the TS gene under the control of a strong promoter system).

[0025] The compositions of the present invention further include cloning vectors and expression vectors containing the nucleic acid molecules of the invention, as well as hosts which have been transformed with such nucleic acid molecules, including cells genetically engineered to contain the nucleic acid molecules of the invention, and/or cells genetically engineered to express the nucleic acid molecules of the invention. In addition to host cells and cell lines, hosts also include transgenic non-human animals (or progeny thereof), particularly non-human mammals, that have been engineered to express an HKNG1 transgene, “knock-outs” that have been engineered to not express HKNG1, transgenic non-human animals (or progeny thereof), particularly non-human mammals (e.g., mice or rats), that have been engineered to express a TS transgene, and “knock-outs” that have been engineered to not express TS.

[0026] Transgenic non-human animals of the invention include animals engineered to express an HKNG1 or a TS transgene at higher or lower levels than normal, wild-type animals. The transgenic animals of the invention also include animals engineered to express a mutant variant or polymorphism of an HKNG1 or TS transgene which is associated with an HKNG1- or TS-mediated disorder, for example neuropsychiatric disorders, such as BAD and schizophrenia, or, alternatively, a myopia disorder such as early-onset autosomal dominant myopia. The transgenic animals of the invention further include the progeny of such genetically engineered animals.

[0027] The invention further relates to methods for the treatment of HKNG1-mediated and/or TS-mediated disorders in a subject, such as HKNG1- and/or TS-mediated neuropsychiatric disorders as well as myopia disorders mediated by HKNG1, wherein such methods comprise administering a compound which modulates the expression of a HKNG1 (or TS) gene and/or the synthesis or activity of a HKNG1 (or TS) gene product so that symptoms of the disorder are ameliorated.

[0028] The invention further relates to methods for the treatment of disorders mediated by HKNG1 or TS in a subject, such as neuropsychiatric disorders and myopia disorders, that are mediated by HKNG1 or TS, e.g., resulting from HKNG1 or TS gene mutations or aberrant levels of HKNG1 or TS expression or activity. Such methods comprise supplying the subject with a nucleic acid molecule encoding an unimpaired HKNG1 or TS gene product such that an unimpaired HKNG1 or TS gene product is expressed and symptoms of the disorder are ameliorated.

[0029] The invention further relates to methods for the treatment of disorders in a subject, such as neuropsychiatric disorders and myopia disorders mediated by HKNG1 or TS, resulting from gene mutations or from aberrant levels of expression or activity of the gene HKNG1 or TS, wherein such methods comprise supplying the subject with a cell comprising a nucleic acid molecule that encodes an unimpaired HKNG1 or TS gene product such that the cell expresses the unimpaired HKNG1 or TS gene product and symptoms of the disorder are ameliorated.

[0030] The invention also encompasses pharmaceutical formulations and methods for treating disorders, including neuropsychiatric disorders, such as BAD and schizophrenia, and myopia disorders, such as early-onset autosomal dominant myopia, involving the HKNG1 or TS gene.

[0031] Further, the present invention is directed to methods that utilize the HKNG1 nucleic acid sequences, chromosome 18p nucleotide sequences flanking the HKNG1 gene, TS nucleic acid sequences, HKNG1 gene product sequences, and/or TS gene product sequences for mapping the chromosome 18p region, and for the diagnostic evaluation, genetic testing and prognosis of a HKNG1- or a TS-mediated disorder, such as a neuropsychiatric disorder or a myopia disorder. For example, in one embodiment, the invention relates to methods for diagnosing HKNG1-mediated disorders, wherein such methods comprise measuring HKNG1 gene expression in a patient sample, or detecting a HKNG1 polymorphism or mutation in the genome of a mammal, including a human, suspected of exhibiting such a disorder. In one embodiment, nucleic acid molecules encoding HKNG1 can be used as diagnostic hybridization probes or as primers for diagnostic PCR analysis for the identification of HKNG1 gene mutations, allelic variations and regulatory defects in the HKNG1 gene which correlate with neuropsychiatric disorders such as BAD or schizophrenia.

[0032] In another exemplary embodiment, the invention relates to methods for diagnosing TS-mediated disorders, wherein such methods comprise measuring TS gene expression in a patient sample or detecting a TS polymorphism or mutation in the genome of a mammal, including a human, suspected of exhibiting such a disorder. In one embodiment, nucleic acid molecules encoding TS can be used as diagnostic hybridization probes or as primers for diagnostic PCR analysis for the identification of TS gene mutations, allelic variations and regulatory defects in the TS gene which correlate with a TS-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia).

[0033] The invention still further relates to methods for identifying compounds which modulate the expression of the HKNG1 gene and/or the synthesis or activity of the HKNG1 gene products. Such methods can identify therapeutic compounds, which reduce or eliminate the symptoms of HKNG1-mediated disorders, including HKNG1-mediated neuropsychiatric disorders such as BAD and schizophrenia, and/or compounds that can be tested for an ability to act as therapeutic compounds. Further, the invention also relates to methods for identifying compounds which modulate the expression of the TS gene and/or the synthesis or activity of a TS gene product. Such methods can identify therapeutic compounds, which reduce or eliminate symptoms of TS-mediated disorders, including TS-mediated neuropsychiatric disorders such as BAD and schizophrenia, and/or compounds that can be tested for an ability to act as therapeutic compounds.

[0034] Among such methods are animal, cellular and non-cellular assays that can be used to identify compounds that interact with a HKNG1 gene product or with a TS gene product, such as compounds which modulate the activity (e.g., level of gene expression, level of gene product, and/or biochemical activity of the gene product) of an HKNG1 gene product and/or bind to the HKNG1 gene product, or compounds which modulate the activity of a TS gene product and/or bind to the TS gene product. In the case of animal or cell-based assays of the invention, such assays typically utilize animals (e.g., transgenic animals), cells, cell lines, or engineered cells or cell lines that express the HKNG1 or the TS gene product.

[0035] In one embodiment, such methods comprise contacting a compound with a cell that expresses a HKNG1 gene, measuring the level of HKNG1 gene expression, gene product expression or gene product biochemical activity, and comparing this level to the level of HKNG1 gene expression, gene product expression or gene product biochemical activity produced by the cell in the absence of the compound, such that if the level obtained in the presence of the compound differs from that obtained in its absence, a compound that modulates the expression of the HKNG1 gene and/or the synthesis or activity of the HKNG1 gene products has been identified.

[0036] In another embodiment, such methods comprise contacting a compound with a cell that expresses a HKNG1 gene and also comprises a reporter construct whose transcription is dependent, at least in part, on HKNG1 expression or activity. In such an embodiment, the level of reporter transcription is measured and compared to the level of reporter transcription in the cell in the absence of the compound. If the level of reporter transcription obtained in the presence of the compound differs from that obtained in its absence, a compound that modulates expression of HKNG1 or genes involved in HKNG1-related pathways or signal transduction has been identified.

[0037] In yet another embodiment, such methods comprise administering a compound to a host, such as a transgenic animal, that expresses an HKNG1 transgene or a mutant HKNG1 transgene associated with an HKNG1-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia), or to an animal, e.g., a knock-out animal, that does not express HKNG1, and measuring the level of HKNG1 gene expression, gene product expression, gene product activity, or symptoms of an HKNG1-mediated disorder such as an HKNG1-mediated neuropsychiatric disorder (e.g., BAD or schizophrenia). The measured level is compared to the level obtained in a host that is not exposed to the compound, such that if the level obtained when the host is exposed to the compound differs from that obtained in a host not exposed to the compound, a compound that modulates the expression of the mammalian HKNG1 gene and/or the synthesis or activity of the mammalian HKNG1 gene products, and/or the symptoms of an HKNG1-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia), has been identified.

[0038] Similar methods utilize a TS nucleic acid and/or gene product. Thus, in one embodiment, the methods comprise contacting a compound with a cell that expresses a TS gene, measuring the level of TS gene expression, gene product expression or gene product activity, and comparing this level to the levels of TS gene expression, gene product expression or gene product activity produced by the cell in the absence of the compound, such that if the level obtained in the presence of the compound differs from that obtained in its absence, a compound that modulates the expression of the TS gene and/or the synthesis or activity of the TS gene product has been identified.

[0039] In another embodiment, such methods comprise contacting a compound with a cell that expresses a TS gene and also comprises a reporter construct whose transcription is dependent, at least in part, on TS expression or activity. In such an embodiment, the level of reporter transcription is measured and compared to the level of reporter transcription in the cell in the absence of the compound. If the level of reporter transcription obtained in the presence of the compound differs from that obtained in its absence, a compound that modulates expression of TS or genes involved in TS-related pathways or signal transduction has been identified.

[0040] In yet another embodiment, such methods comprise administering a compound to a host, such as a transgenic animal, that expresses a TS transgene or a mutant TS transgene associated with a TS-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia) or to an animal (e.g., a knock-out animal) that does not express TS, and measuring the level of TS gene expression, gene product expression, gene product activity or symptoms of a TS-mediated disorder (e.g., a TS-mediated neuropsychiatric disorder such as BAD or schizophrenia). The measured level is compared to the level obtained in a host that is not exposed to the compound, such that if the level obtained when the host is exposed to the compound differs from that obtained in a host not exposed to the compound, a compound that modulates the expression of the mammalian TS gene and/or the synthesis or activity of a mammalian TS gene product, and/or the symptoms of a TS-mediated disorder (e.g., a neuropsychiatric disorder such as BAD or schizophrenia), has been identified.

[0041] The present invention still further relates to pharmacogenomic and pharmacogenetic methods for selecting an effective drug to administer to an individual having a HKNG1-mediated disorder. Such methods are based on the detection of genetic polymorphisms in the HKNG1 gene or variations in HKNG1 gene expression due to, e.g., altered methylation, differential splicing, or post-translational modification of the HKNG1 gene product which can affect the safety and efficacy of a therapeutic agent. The invention still further relates to pharmacogenomic and pharmacogenetic methods for selecting an effective drug to administer to an individual having a TS-mediated disorder. Such methods are based on the detection of genetic polymorphisms in the TS gene or variations in TS gene expression due to, e.g., altered methylation, differential splicing, or post-translational modification of the TS gene product which can affect the safety and efficacy of a therapeutic agent.

As used herein, the following terms shall have the abbreviations indicated.

[0042] BAC, bacterial artificial chromosomes

[0043] BAD, bipolar affective disorder(s)

[0044] BP, bipolar mood disorder

[0045] BP-I, severe bipolar affective (mood) disorder

[0046] BP-II, bipolar affective (mood) disorder with hypomania and major depression

bp, base pair(s)

[0047] EST, expressed sequence tag

[0048] HKNG1, Hong Kong new gene 1

[0049] lod, logarithm of odds

[0050] MDD, unipolar major depressive disorder

[0051] MHC, major histocompatibility complex

[0052] ROS, reactive oxygen species

[0053] RT-PCR, reverse transcriptase PCR

[0054] SSCP, single-stranded conformational polymorphism

[0055] SAD-M, schizoaffective disorder manic type

[0056] STS, sequence tagged site

[0057] TS, thymidylate synthase

[0058] YAC, yeast artificial chromosome

[0059] “HKNG1-mediated, GNKH-mediated and/or TS-mediated disorders” include disorders involving an aberrant level of HKNG1, GNKH and/or TS gene expression, gene product synthesis and/or gene product activity relative to levels found in clinically normal individuals, and/or relative to levels found in a population whose level represents a baseline, average HKNG1, GNKH and/or TS level. While not wishing to be bound by any particular mechanism, it is to be understood that disorder symptoms can, for example, be caused, either directly or indirectly, by such aberrant levels. Alternatively, it is to be understood that such aberrant levels can, either directly or indirectly, ameliorate disorder symptoms (e.g., as in instances wherein aberrant levels of HKNG1, GNKH and/or TS suppress the disorder symptoms caused by mutations within a second gene).

[0060] HKNG1-mediated, GNKH-mediated and/or TS-mediated disorders include, for example, central nervous system (CNS) disorders. CNS disorders include, but are not limited to, cognitive and neurodegenerative disorders such as Alzheimer's disease, senile dementia, Huntington's disease, amyotrophic lateral sclerosis, and Parkinson's disease, as well as Gilles de la Tourette's syndrome, autonomic function disorders such as hypertension and sleep disorders, and neuropsychiatric disorders that include, but are not limited to, schizophrenia, schizoaffective disorder, attention deficit disorder, dysthymic disorder, major depressive disorder, mania, obsessive-compulsive disorder, psychoactive substance use disorders, anxiety, panic disorder, as well as bipolar affective disorder, e.g., severe bipolar affective (mood) disorder (BP-I) and bipolar affective (mood) disorder with hypomania and major depression (BP-II). Further CNS-related disorders include, for example, those listed in the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders (DSM), the most current version of which is incorporated herein by reference in its entirety.

[0061] “HKNG1-mediated, GNKH-mediated and/or TS-mediated processes” include processes dependent on and/or responsive to, either directly or indirectly, levels of HKNG1, GNKH and/or TS gene expression, gene product synthesis and/or gene product activity. Such processes can include, but are not limited to, developmental, cognitive and autonomic neural and neurological processes, such as, for example, pain, appetite, long term memory and short term memory.

[0062] Nucleotide sequences, including cDNA sequences, genomic DNA sequences as well as RNA sequences, e.g., for oligonucleotides, nucleotide probes and nucleotide primers, are depicted herein, unless otherwise noted, in the 5′ to 3′ direction and according to the single letter nucleic acid code as follows:

A    Adenine
C    Cytosine
G    Guanine
T    Thymine
U    Uracil
R    either Adenine or Guanine
Y    either Cytosine or Thymine
K    either Guanine or Thymine
M    either Adenine or Cytosine
S    either Cytosine or Guanine
W    either Adenine or Thymine
B    any base except Adenine
D    any base except Cytosine
H    any base except Guanine
V    any base except Thymine
N    any base (i.e., Adenine, Cytosine, Guanine or Thymine) is permitted
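As an illustration only, the single letter code listed above can be written as a lookup table. The following is a minimal sketch; the names `AMBIGUITY`, `matches` and `expand` are hypothetical helpers introduced here and are not part of the disclosure.

```python
from itertools import product

# Single letter nucleic acid code as listed in the paragraph above.
# Each code maps to the concrete bases it permits.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT",   # any base except Adenine
    "D": "AGT",   # any base except Cytosine
    "H": "ACT",   # any base except Guanine
    "V": "ACG",   # any base except Thymine
    "N": "ACGT",  # any base
}


def matches(code: str, base: str) -> bool:
    """Return True if the concrete base is permitted by the single letter code."""
    return base.upper() in AMBIGUITY[code.upper()]


def expand(pattern: str) -> list[str]:
    """List every concrete sequence (5' to 3') that a degenerate pattern can denote."""
    choices = (AMBIGUITY[symbol] for symbol in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]
```

For example, under this sketch a degenerate primer written as "RY" stands for four concrete dinucleotides, which `expand("RY")` enumerates.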

[0063] Polypeptide and other amino acid sequences, including full length and partial peptide, polypeptide and protein sequences, are depicted herein, unless otherwise noted, in the amino- to carboxy-terminal direction and according to either the one letter or three letter amino acid code as follows:

A    Ala    Alanine
C    Cys    Cysteine
D    Asp    Aspartic acid
E    Glu    Glutamic acid
F    Phe    Phenylalanine
G    Gly    Glycine
H    His    Histidine
I    Ile    Isoleucine
K    Lys    Lysine
L    Leu    Leucine
M    Met    Methionine
N    Asn    Asparagine
P    Pro    Proline
Q    Gln    Glutamine
R    Arg    Arginine
S    Ser    Serine
T    Thr    Threonine
V    Val    Valine
W    Trp    Tryptophan
Y    Tyr    Tyrosine
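By way of illustration, the one letter and three letter codes listed above can be interconverted with a simple table. This is a minimal sketch; `ONE_TO_THREE` and `to_three_letter` are hypothetical names chosen for this example.

```python
# One letter to three letter amino acid code, as listed in the paragraph above.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(sequence: str) -> str:
    """Rewrite a one letter amino acid sequence in three letter code."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


# The carboxy-terminal sequence "RRSNASYIQ" (SEQ ID NO:132) in three letter code.
print(to_three_letter("RRSNASYIQ"))
```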

4. BRIEF DESCRIPTION OF THE FIGURES

[0064] FIGS. 1A-1C. Nucleotide sequence (SEQ ID NO: 1) of human HKNG1 cDNA (bottom line); amino acid sequence (SEQ ID NO: 2) of its derived polypeptide (top line). The nucleotide sequence encoding SEQ ID NO:2 corresponds to SEQ ID NO:5.

[0065] FIGS. 2A-2C. Nucleotide sequence (SEQ ID NO: 3) of an alternately spliced human HKNG1 variant, referred to as HKNG1-V1 (bottom line); and the derived amino acid sequence (SEQ ID NO: 4) of its polypeptide (top line). The nucleotide sequence encoding SEQ ID NO:4 corresponds to SEQ ID NO:6.

[0066] FIGS. 3A-1 to 3A-28. The genomic sequence (SEQ ID NO: 7) of the human HKNG1 gene. The exons are indicated by underlined bold face type; the 3′ and 5′ UTRs (untranslated regions) are double-underlined.

[0067] FIGS. 4A and 4B. A summary of in situ hybridization analysis of HKNG1 mRNA distribution in normal human brain tissue.

[0068] FIGS. 5A-5C. HKNG1 polymorphisms relative to the HKNG1 wild-type sequence. These polymorphisms were isolated from a collection of schizophrenic patients of mixed ethnicity from the United States (FIGS. 5A-5B) and from the San Francisco BAD collection (FIG. 5C).

[0069] FIGS. 6A-B. The nucleotide sequences of the RT-PCR products for HKNG1-V2 (FIG. 6A; SEQ ID NO:36) and HKNG1-V3 (FIG. 6B; SEQ ID NO:37).

[0070] FIGS. 7A-7C. The cDNA sequence (SEQ ID NO:38) and the predicted amino acid sequence (SEQ ID NO:39) of the guinea pig HKNG1 ortholog gphkng1815.

[0071] FIGS. 8A-8C. The cDNA sequence (SEQ ID NO:40) and the predicted amino acid sequence (SEQ ID NO:41) of gphkng7b, an allelic variant of the guinea pig HKNG1 ortholog gphkng1815.

[0072] FIGS. 9A-9C. The cDNA sequence (SEQ ID NO:42) and the predicted amino acid sequence (SEQ ID NO:43) of gphkng7c, an allelic variant of the guinea pig HKNG1 ortholog gphkng1815.

[0073] FIGS. 10A-10C. The cDNA sequence (SEQ ID NO:44) and the predicted amino acid sequence (SEQ ID NO:45) of gphkng7d, an allelic variant of the guinea pig HKNG1 ortholog gphkng1815.

[0074] FIGS. 11A-11C. The cDNA sequence (SEQ ID NO:46) and the predicted amino acid sequence (SEQ ID NO:49) of the allelic variant bhkng1 of the bovine HKNG1 ortholog.

[0075] FIGS. 12A-12D. The cDNA sequence (SEQ ID NO:47) and the predicted amino acid sequence (SEQ ID NO:49) of the allelic variant bhkng2 of the bovine HKNG1 ortholog.

[0076] FIGS. 13A-13C. The cDNA sequence (SEQ ID NO:48) and the predicted amino acid sequence (SEQ ID NO:49) of the allelic variant bhkng3 of the bovine HKNG1 ortholog.

[0077] FIGS. 14A-14M. Alignments of the guinea pig HKNG1 cDNA sequences (FIGS. 14A-14L) and the predicted amino acid sequences (FIG. 14M) for gphkng1815 (SEQ ID NOS:38 (cDNA) and 39 (amino acid)), gphkng7b (SEQ ID NOS:40 (cDNA) and 41 (amino acid)), gphkng7c (SEQ ID NOS:42 (cDNA) and 43 (amino acid)), and gphkng7d (SEQ ID NOS:44 (cDNA) and 45 (amino acid)). The “Majority” sequence for the cDNAs is provided in FIGS. 14A-14L (SEQ ID NO:165).

[0078] FIGS. 15A-15F. Alignments of the cDNA sequences of the bovine HKNG1 allelic variants bhkng1, bhkng2, and bhkng3 (SEQ ID NO:46, SEQ ID NO:47 and SEQ ID NO:48).

[0079] FIG. 16. Alignments of the amino acid sequences of human (hkng_aa), bovine (bhkng_aa) and guinea pig (gphkng1815_aa) HKNG1 cDNA (SEQ ID NO:131, SEQ ID NO:49 and SEQ ID NO:39).

[0080] FIGS. 17A and 17B. Alignments of human HKNG1 protein sequences; top line: the mature secreted HKNG1 protein sequence (SEQ ID NO:51); second line: immature HKNG1 protein form 1 (IPF1; SEQ ID NO:2); third line: immature HKNG1 protein form 2 (IPF2; SEQ ID NO:64); bottom line: immature HKNG1 protein form 3 (IPF3; SEQ ID NO:4).

[0081] FIGS. 18A-18C. The nucleotide sequence (SEQ ID NO: 65) of human HKNG1 splice variant HKNG1Δ7 cDNA (bottom line) and the predicted full length amino acid sequence (SEQ ID NO: 66) of its derived polypeptide (top line).

[0082] FIG. 19. The genomic organization of the HKNG1 gene. The arrows denote positions of the markers used in genetic linkage analysis with associated p values. The box shows the region spanning exon 11 with the highest evidence for genetic linkage.

[0083] FIGS. 20A-20D. A schematic representation of various 3′-splice variants of human HKNG1 identified by RT-PCR. FIG. 20A shows a schematic representation of the exon structure at the 3′-end of the full length splice variant depicted in FIGS. 1A-1C (SEQ ID NO:1). Three additional splice variants were also identified: a splice variant, referred to as “HKNG1Δ10,” the exon structure of which is shown in FIG. 20B; a splice variant, referred to as “HKNG1+intron10,” the exon structure of which is shown in FIG. 20C; and a splice variant referred to as “HKNG1+10′,” the exon structure of which is shown in FIG. 20D.

[0084] FIGS. 21A, 21B-1, and 21B-2. The partial nucleotide sequence (FIG. 21A; SEQ ID NO:121) of the human HKNG1 3′-splice variant HKNG1Δ10, and the predicted HKNG1Δ10 gene product (FIGS. 21B-1 and 21B-2; SEQ ID NO: 159).

[0085] FIG. 22. The partial nucleotide sequence (SEQ ID NO:122) of human HKNG1 3′-splice variant HKNG1+intron10 cDNA.

[0086] FIGS. 23A-C. The partial nucleotide sequence (FIG. 23A; SEQ ID NO:123) of human HKNG1 3′-splice variant HKNG1+10′, and the predicted HKNG1+10′ gene product (FIGS. 23B and 23C; SEQ ID NO:133).

[0087] FIG. 24. A schematic representation of ESTs found to contig with the HKNG1 gene. The ESTs are labeled with their GenBank accession numbers.

[0088] FIG. 25. A schematic representation of contigs (GNKH, contig 1; HKNG1, contig 2) derived by EST datamining.

[0089] FIG. 26. The additional 565 bases of downstream sequence which is contiguous with the previously identified HKNG1 sequence (SEQ ID NO:73). This downstream sequence was derived by DNA sequencing of H81803. The bases that were not available from the GenBank database are highlighted. The underlined bases are divergent from the genomic sequence of the identified HKNG1 sequence.

[0090] FIG. 27. A schematic representation of ESTs that contribute to the GNKH contig. The ESTs are labeled with their GenBank accession numbers.

[0091] FIG. 28. The nucleotide sequence of GNKH cDNA (SEQ ID NO: 74).

[0092] FIG. 29. A schematic alignment of HKNG1/TS genomic DNA to GNKH cDNA. GNKH is depicted in the 3′-5′ orientation to highlight its relationship to HKNG1 and TS. AAAA signifies the presence of a polyA tail. The size of the 2 GNKH putative exons is given, as is the size of the regions of GNKH which overlap with HKNG1 and TS exon sequence.

[0093] FIGS. 30A-30B. An alignment of GNKH (GNKHEXP) to a HKNG1 genomic DNA fragment. The genomic sequence of GNKH (SEQ ID NO: 124) is depicted in the 5′-3′ orientation to highlight its relationship to HKNG1 (SEQ ID NO:160) and TS.

[0094] FIG. 31. A schematic diagram of the relationship of the HKNG1, TS, GNKH and rTS genes. The last exon of HKNG1, and the first and last exon of TS are represented as boxes, separated by intron sequences (solid line). GNKH and rTS are represented as boxes (exons) separated by spliced out introns (solid lines) with approximate intron sizes shown. Dashed lines represent the 13 kb intervening genomic sequence which lies between GNKH and rTS. AAA represents predicted polyadenylation sites.

[0095] FIG. 32. The predicted amino acid sequence (SEQ ID NO:75) of GNKH Open Reading Frame a (ORFa) encoded by GNKH bases 383-754.

[0096] FIG. 33. The predicted amino acid sequence (SEQ ID NO:76) of GNKH Open Reading Frame b (ORFb) encoded by GNKH bases 510-845.

[0097] FIG. 34. The nucleotide sequence of partial rat HKNG1 cDNA (SEQ ID NO:109) and the predicted amino acid sequence (SEQ ID NO:110) of the derived rat HKNG1 polypeptide encoded thereby.

[0098] FIG. 35. The amino acid alignment of human (SEQ ID NO:161), bovine (SEQ ID NO: 162), guinea pig (SEQ ID NO:163), and rat (SEQ ID NO:164) HKNG1 cDNA. Lower case letters represent amino acids encoded by primers and upper case letters represent the amplified amino acids encoded by PCR product.

[0099] FIGS. 36A-B. The nucleotide sequence of a partial rat HKNG1 cDNA (FIG. 36A, SEQ ID NO:111) isolated by 3′ RACE, and the predicted amino acid sequence for the partial rat HKNG1 gene product (FIG. 36B, SEQ ID NO:112) it encodes.

[0100] FIGS. 37A-B. The sequence of a larger partial rat HKNG1 cDNA (FIG. 37A, SEQ ID NO:113) that corresponds to regions encoding the carboxy terminus of a rat HKNG1 gene product (FIG. 37B, SEQ ID NO:114).

[0101] FIGS. 38A-C. The sequence of the published EST identified by GenBank Accession No. AI715798 (FIG. 38A, SEQ ID NO:115), its complementary sequence (FIG. 38B, SEQ ID NO:116), and a predicted polypeptide sequence (FIG. 38C, SEQ ID NO:117) encoded by the complementary sequence.

[0102] FIGS. 39A, 39B-1, and 39B-2. The nucleotide sequence of a cDNA (FIG. 39A, SEQ ID NO:119) encoding a full length rat HKNG1 gene product (FIGS. 39B-1 and 39B-2, SEQ ID NO:120).

[0103] FIGS. 40A, 40B-1, and 40B-2. The nucleotide sequence of a rat HKNG1 cDNA (FIG. 40A, SEQ ID NO:134) encoding a full length rat HKNG1 T variant gene product (FIGS. 40B-1 and 40B-2, SEQ ID NO:135).

[0104] FIGS. 41A, 41B-1, and 41B-2. The nucleotide sequence of a rat HKNG1 cDNA (FIG. 41A, SEQ ID NO:136) encoding a full length rat HKNG1 C variant gene product (FIGS. 41B-1 and 41B-2, SEQ ID NO:137).

[0105] FIGS. 42A-B. The nucleotide sequence of a rat HKNG1 cDNA (FIG. 42A, SEQ ID NO:138) encoding a rat HKNG1 delta 9-splice variant gene product (FIG. 42B, SEQ ID NO:139).

[0106] FIGS. 43A and 43B. The amino acid alignment of human (SEQ ID NO:64), bovine (SEQ ID NO:49), guinea pig (SEQ ID NO:45), and rat HKNG1 T variant (SEQ ID NO:135), rat HKNG1 delta 9 variant cDNA (SEQ ID NO:139), and rat HKNG1 C variant (SEQ ID NO:137).

[0107] FIGS. 44A-G. The genomic sequence (SEQ ID NO:140) of the human TS gene. The exons are indicated by underlined bold face type; the 3′ and 5′ UTRs (untranslated regions) are double-underlined.

[0108] FIGS. 45A-B. The nucleotide sequence of a human TS cDNA (FIG. 45A, SEQ ID NO:141) encoding a human TS gene product (FIG. 45B, SEQ ID NO:142).

[0109] FIG. 46. Hydropathy plot of human TS protein. Relatively hydrophobic residues are above the horizontal line, and relatively hydrophilic residues are below the horizontal line.

[0110] FIGS. 47A-C. Pedigree CR001 with the ID numbers of individuals corresponding to those in the columns of Table 15. All haplotypes were reconstructed by hand. Bracketed alleles indicate that assignment of phase cannot be certain. RC indicates that the haplotypes for these persons were reconstructed as no sample was available for genotyping. A ? indicates missing data.

[0111] FIG. 48. Map of the genes contained in the 300 kb BP-I candidate interval on 18p11.3. The vertical lines indicate the location of the SNPs giving evidence for association to BP-I including (from left to right, or telomere to centromere) PH33, PH84, PH205, PH202, PH208, TS16, and TS30.

5. DETAILED DESCRIPTION OF THE INVENTION

5.1. CHROMOSOME 18P NUCLEIC ACID MOLECULES

[0112] This section describes, in detail, the nucleic acid molecules of the present invention. In particular, the nucleic acid molecules of a gene which is referred to herein as “HKNG1” or the “HKNG1 gene” are described herein. The discovery and characterization of the human HKNG1 gene, including the genomic sequence of the HKNG1 gene and several splice variants and polymorphisms, are described in the Examples presented in Sections 6-9, below. The isolation and characterization of certain exemplary orthologs of the HKNG1 gene in other species (i.e., bovine, guinea pig and rat) are also described in the Examples presented, below, in Sections 10 and 19. Further, vectors encoding fusion proteins of the HKNG1 gene product, which are also, therefore, considered to be among the HKNG1 gene sequences of the invention, are described in the Example presented, below, in Section 11.

[0113] The nucleic acid molecules of a second novel gene are also described in this Section. Specifically, this section also describes the nucleic acid molecules of a gene which is referred to herein as GNKH. The isolation and characterization of the GNKH gene and its nucleic acid sequences, including certain exemplary polymorphisms of the GNKH nucleic acid sequences, are described, below, in the Examples presented in Sections 16 and 17.

[0114] The nucleic acid molecules of a known gene are also described in this Section. Specifically, this section also describes the nucleic acid molecules of a gene encoding thymidylate synthase which is referred to herein as TS. The characterization of the TS gene and its nucleic acid sequences, including certain exemplary polymorphisms of the TS nucleic acid sequences, is described, below, in the Example presented in Section 21.

5.1.1. THE HKNG1 GENE

[0115] Unless otherwise stated, the term “HKNG1 nucleic acid” or “HKNG1 gene” is understood to refer collectively to those sequences described in this subsection as well as to allelic variants and polymorphisms of those sequences such as the allelic variants and polymorphisms described, below, in Section 5.1.3. In particular, the genomic structure of the human HKNG1 gene has been elucidated and is depicted in FIGS. 3A-1 to 3A-28 and in SEQ ID NO:7. The intronic structure of the human HKNG1 gene has also been elucidated and is also disclosed in FIGS. 3A-1 to 3A-28. In particular, the exon sequences of the human HKNG1 gene are depicted in bold-faced type in FIGS. 3A-1 to 3A-28. The exons of the human HKNG1 gene are also depicted, schematically, in FIG. 29.

[0116] A human HKNG1 cDNA sequence (SEQ ID NO:1) encoding the full length amino acid sequence (SEQ ID NO:2) of the HKNG1 polypeptide is depicted in FIGS. 1A-C. This human HKNG1 gene encodes a secreted polypeptide of 495 amino acid residues, as shown in FIGS. 1A-C and in SEQ ID NO:2. The nucleotide sequence of the portion of this full length human HKNG1 cDNA corresponding to the open reading frame (“ORF”) encoding this HKNG1 gene product is depicted as SEQ ID NO:5.

[0117] The HKNG1 sequences of the invention also include splice variants of the HKNG1 sequences described herein. For example, an alternatively spliced human HKNG1 cDNA sequence, referred to herein as HKNG1-V1 (SEQ ID NO:3), is shown in FIGS. 2A-C along with the amino acid sequence (SEQ ID NO:4) of the human HKNG1 variant gene product (i.e., the HKNG1-V1 gene product) it encodes. This splice variant of the human HKNG1 gene encodes a secreted polypeptide of 477 amino acid residues, as shown in FIGS. 2A-C and in SEQ ID NO:4. The nucleotide sequence of the portion of the HKNG1-V1 cDNA corresponding to the open reading frame encoding the HKNG1-V1 gene product is depicted in SEQ ID NO:6.

[0118] Another alternatively spliced human HKNG1 cDNA sequence, referred to herein as HKNG1Δ7 (SEQ ID NO:65), is shown in FIGS. 18A-C, along with the amino acid sequence (SEQ ID NO:66) of the human HKNG1 variant gene product (i.e., the HKNG1Δ7 gene product) it encodes.

[0119] Other alternatively spliced HKNG1 cDNA sequences are also provided herein. In particular, another alternatively spliced HKNG1 cDNA sequence, referred to herein as HKNG1-V2 (SEQ ID NO:36), is described in the example presented in Section 9, below. This alternatively spliced human HKNG1 cDNA sequence contains a new exon, referred to herein as Exon 2′ (SEQ ID NO:34). Yet another alternatively spliced HKNG1 cDNA sequence, referred to herein as HKNG1-V3 (SEQ ID NO:37), is also described in the example presented in Section 9. This alternatively spliced human HKNG1 cDNA sequence contains a new exon, referred to herein as Exon 2″ (SEQ ID NO:35). Both of these exons (i.e., Exon 2′ and Exon 2″) are part of the 5′-untranslated region of the HKNG1 cDNA. Thus, the splice variants HKNG1-V2 and HKNG1-V3 encode HKNG1 polypeptides identical to the full length HKNG1 polypeptide depicted in FIGS. 1A-C (SEQ ID NO:2).

[0120] 3′-splice variants of the human HKNG1 gene are also disclosed herein, in Section 9. Specifically, the partial sequence of a splice variant that lacks Exon 10 of the HKNG1 genomic sequence, and which is therefore referred to herein as HKNG1Δ10, is depicted in FIG. 21A (SEQ ID NO:121). This splice variant is therefore predicted to encode a HKNG1 gene product which does not contain amino acid sequences encoded by Exon 10 of the HKNG1 genomic sequence. In particular, the predicted gene product encoded by HKNG1Δ10 (SEQ ID NO:131), which is depicted in FIGS. 21B-1 and 21B-2, comprises the sequence of amino acid residues 1-428 of the full length HKNG1 gene product shown in FIGS. 1A-C (SEQ ID NO:2) followed by the novel carboxy-terminal sequence “RRSNASYIQ” (SEQ ID NO:132).

[0121] The partial sequence of another alternatively spliced human HKNG1 gene sequence, referred to herein as “HKNG1+intron10” (SEQ ID NO:122), is depicted in FIG. 22. The HKNG1+intron10 splice variant comprises, in addition to the nucleotide sequences of Exon 10, an additional 125 bases of nucleotide sequence corresponding to Intron 10 (i.e., the intron flanked by Exons 10 and 11 of the HKNG1 genomic sequence). However, because the additional sequences of this splice variant are within the predicted 3′-untranslated region of the HKNG1+intron10 cDNA sequence, the predicted gene product of this splice variant is, in fact, identical to the full length HKNG1 gene product shown in FIGS. 1A-C (SEQ ID NO:2).

[0122] The partial sequence of yet another alternatively spliced human HKNG1 gene sequence, referred to herein as “HKNG1+10′,” is shown in FIG. 23A (SEQ ID NO:123). The nucleotide sequence of this splice variant comprises an additional 159 nucleotides corresponding to a novel exon, referred to herein as Exon 10′, located between Exons 10 and 11 of the HKNG1 genomic sequence shown in FIGS. 3A-1 to 3A-28. The predicted HKNG1+10′ gene product, which is depicted in FIG. 23B (SEQ ID NO:133), is identical to the first 494 amino acid residues of the full length HKNG1 gene product shown in FIGS. 1A-C (SEQ ID NO:2), but does not include the final tryptophan amino acid residue at position 495 of the full length HKNG1 gene product sequence.

[0123] Exemplary non-human homologs or orthologs of the human HKNG1 sequences described above are also provided. Specifically, a guinea pig cDNA sequence (SEQ ID NO:38), referred to herein as gphkng1815, encoding the full length amino acid sequence (SEQ ID NO:39) of a guinea pig HKNG1 ortholog, is shown in FIGS. 7A-7C. This guinea pig cDNA sequence encodes a gene product of 466 amino acid residues, which is also shown in FIGS. 7A-7C and in SEQ ID NO:39.

[0124] Allelic variants of this guinea pig HKNG1 ortholog, referred to as gphkng7b, gphkng7c, and gphkng7d (SEQ ID NOs:40, 42 and 44, respectively), are also provided herein, in FIGS. 8A-8C, 9A-9C and 10A-10C, respectively. The gene products encoded by each of these guinea pig HKNG1 sequences are also depicted in FIGS. 8A-8C, 9A-9C and 10A-10C, respectively, and in SEQ ID NOs: 41, 43, and 45, respectively. The allelic variants gphkng7b, gphkng7c and gphkng7d each encode variants of the guinea pig gphkng1815 HKNG1 gene product which contain deletions of 16, 92 and 93 amino acid residues, respectively, as shown in the sequence alignment depicted in FIGS. 14A-M.

[0125] Bovine HKNG1 ortholog cDNA sequences (SEQ ID NOs:46-48), referred to herein as bhkng1, bhkng2 and bhkng3, are also provided herein, in FIGS. 11A-11C, 12A-12D and 13A-13C, respectively. Each of these bovine HKNG1 ortholog sequences encodes the same bovine ortholog gene product; i.e., a polypeptide of 465 amino acid residues (SEQ ID NO:49), as shown in FIGS. 11A-11C, 12A-12D and 13A-13C. A rat HKNG1 ortholog cDNA sequence (SEQ ID NO:119) is provided in FIGS. 39A, 39B-1 and 39B-2, along with the rat ortholog HKNG1 gene product it encodes (SEQ ID NO:120). Further, partial rat HKNG1 cDNA sequences (SEQ ID NOs:109, 111, 113 and 116) are also provided along with their predicted amino acid sequences (SEQ ID NOs:110, 112, 114, 117 and 118). An alignment of the human, guinea pig, bovine and rat ortholog HKNG1 gene products is depicted in FIG. 35.

[0126] The nucleic acid molecules of the present invention therefore include the following HKNG1 nucleic acid molecules: (a) nucleotide sequences, and fragments thereof, that encode a HKNG1 gene product or a fragment thereof, including nucleotide sequences that encode an amino acid sequence depicted in any one of SEQ ID NOs:2, 4 and 66 (e.g., the nucleotide sequences depicted in SEQ ID NOs: 1, 3, 5, 6, 7, 36, 37 and 65), as well as homologs, orthologs and allelic variants of such sequences and fragments thereof (e.g., SEQ ID NOs:38, 40, 42, 44, 46-48 and 75) which encode homolog or ortholog HKNG1 gene products (e.g., any polypeptides having an amino acid sequence depicted in SEQ ID NOs:39, 41, 43, 45, 49 or 76); (b) nucleotide sequences that encode one or more functional domains of a HKNG1 gene product, including, but not limited to, nucleic acid sequences that encode a signal sequence domain or one or more clusterin domains as described in Section 5.2, below; (c) nucleotide sequences that comprise HKNG1 gene sequences of upstream untranslated regions, intronic regions and/or downstream untranslated regions or fragments thereof of the HKNG1 nucleotide sequences in (a) above; (d) nucleotide sequences comprising novel HKNG1 sequences disclosed herein that encode mutants of the HKNG1 gene product in which all or a part of one or more of the domains is deleted or altered, as well as fragments thereof; (e) nucleotide sequences that encode fusion proteins comprising a HKNG1 gene product (e.g., any of the HKNG1 gene products depicted in SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 65 and 76) or a portion thereof fused to a heterologous polypeptide; and (f) nucleotide sequences (e.g., primers) within the HKNG1 gene and chromosome 18p nucleotide sequences flanking the HKNG1 gene which can be utilized, e.g., as part of the methods of the invention for identifying and diagnosing individuals at risk for or exhibiting a HKNG1-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia) or myopia.

[0127] The HKNG1 nucleotide sequences of the invention further include nucleotide sequences corresponding to the nucleotide sequences of (a)-(f), above, wherein one or more of the exons, or fragments thereof, have been deleted. For example, in one preferred embodiment, the HKNG1 nucleotide sequence of the invention is a sequence wherein the exon corresponding to Exon 7 of SEQ ID NO:7, or a fragment thereof, has been deleted. In another exemplary preferred embodiment, the HKNG1 nucleotide sequence of the invention is a sequence wherein the exon corresponding to Exon 10 of SEQ ID NO:7, or a fragment thereof, has been deleted.

[0128] The HKNG1 nucleotide sequences of the invention also include nucleotide sequences that have at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more nucleotide sequence identity to the HKNG1 nucleotide sequences of (a)-(f) above. The HKNG1 nucleotide sequences of the invention further include nucleotide sequences that encode polypeptides having at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or higher amino acid sequence identity to the polypeptides encoded by the HKNG1 nucleotide sequences of (a)-(f) above, e.g., SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, and 66.

[0129] To determine the percent identity of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical overlapping positions / total # of positions) × 100%). In one embodiment, the two sequences are the same length.
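The percent identity calculation described above can be sketched as follows. This is an illustrative example only: it assumes that the two sequences have already been aligned, that gaps are written as "-", and that "total # of positions" means the length of the alignment including gapped columns. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are represented by '-'. A column counts as identical only when both
    sequences carry the same residue (a gap never counts as a match).
    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")

    identical = sum(
        1
        for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper())
        if res_a == res_b and res_a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Three of four aligned positions match.
print(percent_identity("ACGT", "ACGA"))
```

Other conventions for the denominator (for example, the length of the shorter sequence) would give different values for gapped alignments, so the convention used should be stated alongside any reported figure.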

[0130] The determination of percent identity between two sequences can also be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul, et al. (1990) J. Mol. Biol. 215:403-410. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. Alternatively, PSI-Blast can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used (see http://www.ncbi.nlm.nih.gov). Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, (1988) CABIOS 4:11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.

[0131] The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically only exact matches are counted.

[0132] The HKNG1 nucleotide sequences of the invention further include any nucleotide sequence that hybridizes to a HKNG1 nucleic acid molecule of the invention: (a) under stringent conditions, e.g., hybridization to filter-bound DNA in 6× sodium chloride/sodium citrate (SSC) at about 45° C. followed by one or more washes in 0.2×SSC/0.1% SDS at about 50-65° C.; or (b) under highly stringent conditions, e.g., hybridization to filter-bound nucleic acid in 6×SSC at about 45° C. followed by one or more washes in 0.1×SSC/0.2% SDS at about 68° C., or under other hybridization conditions which are apparent to those of skill in the art (see, for example, Ausubel F. M. et al., eds., 1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at pp. 6.3.1-6.3.6 and 2.10.3). Preferably, the HKNG1 nucleic acid molecule that hybridizes to the nucleotide sequence of (a) and (b), above, is one that comprises the complement of a nucleic acid molecule that encodes a HKNG1 gene product. In a preferred embodiment, nucleic acid molecules comprising the nucleotide sequences of (a) and (b), above, encode gene products, e.g., gene products functionally equivalent to an HKNG1 gene product.

[0133] Functionally equivalent HKNG1 gene products include naturally occurring HKNG1 gene products present in the same or different species. In one embodiment, HKNG1 gene sequences in non-human species map to chromosome regions syntenic to the human 18p chromosome location within which human HKNG1 lies. Functionally equivalent HKNG1 gene products also include gene products that retain at least one of the biological activities of the HKNG1 gene products, and/or which are recognized by and bind to antibodies (polyclonal or monoclonal) directed against the HKNG1 gene products.

[0134] Among the nucleic acid molecules of the invention are deoxyoligonucleotides (“oligos”) which hybridize under highly stringent or stringent conditions to the HKNG1 nucleic acid molecules described above. In general, for probes between 14 and 70 nucleotides in length the melting temperature (Tm) is calculated using the formula: Tm(° C.)=81.5+16.6(log[monovalent cations (molar)])+0.41(% G+C)−(500/N), where N is the length of the probe. If the hybridization is carried out in a solution containing formamide, the melting temperature is calculated using the equation Tm(° C.)=81.5+16.6(log[monovalent cations (molar)])+0.41(% G+C)−0.61(% formamide)−(500/N), where N is the length of the probe. In general, hybridization is carried out at about 20-25 degrees below Tm (for DNA-DNA hybrids) or 10-15 degrees below Tm (for RNA-DNA hybrids).
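The melting temperature formulas above lend themselves to a short worked sketch. The example below is illustrative only: it assumes that "log" denotes the base-10 logarithm, and the function names `melting_temperature` and `hybridization_window` are hypothetical.

```python
import math


def melting_temperature(
    monovalent_molar: float,
    gc_percent: float,
    length: int,
    formamide_percent: float = 0.0,
) -> float:
    """Tm in degrees C for a probe of 14 to 70 nucleotides.

    Tm = 81.5 + 16.6*log10([monovalent cations]) + 0.41*(%G+C)
         - 0.61*(% formamide) - 500/N
    Assumption: 'log' in the text is taken as log base 10.
    """
    if not 14 <= length <= 70:
        raise ValueError("formula is stated for probes of 14 to 70 nucleotides")
    if monovalent_molar <= 0:
        raise ValueError("cation concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(monovalent_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length
    )


def hybridization_window(tm: float, hybrid: str = "DNA-DNA") -> tuple[float, float]:
    """Approximate hybridization temperature range (low, high) in degrees C."""
    offsets = {"DNA-DNA": (25.0, 20.0), "RNA-DNA": (15.0, 10.0)}
    below_max, below_min = offsets[hybrid]
    return (tm - below_max, tm - below_min)


# A 20-base probe, 50% G+C, 1 M monovalent cations, no formamide.
tm = melting_temperature(1.0, 50.0, 20)
print(tm, hybridization_window(tm))
```

With the formamide term set to zero, the function reduces to the first formula; a nonzero `formamide_percent` lowers the result by 0.61 degrees per percent formamide, as in the second formula.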

[0135] Exemplary highly stringent conditions for deoxyoligonucleotidesmay comprise, e.g., washing in 6×SSC/0.05% sodium pyrophosphate at 37°C. (for about 14-base oligos), 48° C. (for about 17-base oligos), 55° C.(for about 20-base oligos), and 60° C. (for about 23-base oligos).

[0136] These nucleic acid molecules may encode or act as antisensemolecules, useful, for example, in HKNG1 gene regulation, and/or asantisense primers in amplification reactions of HKNG1 gene nucleic acidsequences. Further, such sequences may be used as part of ribozymeand/or triple helix sequences, also useful for HKNG1 gene regulation.Still further, such molecules may be used as components of diagnosticmethods whereby, for example, the presence of a particular HKNG1 alleleinvolved in a HKNG1-related disorder, e.g., a neuropsychiatric disorder,such as BAD, may be detected.

[0137] Fragments of the HKNG1 nucleic acid molecules can be at least 10 nucleotides in length. In alternative embodiments, the fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more contiguous nucleotides in length. Alternatively, the fragments can comprise sequences that encode at least 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450 or more contiguous amino acid residues of the HKNG1 gene products. Fragments of the HKNG1 nucleic acid molecules can also refer to HKNG1 exons or introns, and, further, can refer to portions of HKNG1 coding regions that encode domains (e.g., clusterin domains) of HKNG1 gene products.

5.1.2. THE GNKH GENE

[0138] Unless otherwise stated, the term “GNKH nucleic acid” or “GNKH gene” is understood to refer collectively to those nucleic acid sequences described in this subsection, as well as to allelic variants and polymorphisms of those sequences such as the allelic variants and polymorphisms described, below, in Section 5.1.4. In particular, the cDNA sequence of a novel human GNKH gene is provided, herein, in FIG. 28 (SEQ ID NO:74). The sequence contains at least two open reading frames (“ORFs”) which encode polypeptides of 123 and 111 amino acid residues, respectively. Each of these polypeptides is depicted, individually, in FIGS. 32 and 33, and in SEQ ID NOs:75-76, respectively.

[0139] The genomic structure of GNKH has also been elucidated, and is disclosed herein in FIGS. 30A-30B (bottom sequence, SEQ ID NO:124). In particular, the GNKH genomic sequence depicted in FIGS. 30A-30B aligns with a portion of the HKNG1 genomic sequence, and with the genomic sequence of a second gene, TS, that lies adjacent to the HKNG1 genomic sequence on human chromosome 18p (Hori et al., 1990, Hum. Genet. 85:576-580). A schematic diagram of the relationship between the genes HKNG1, TS, rTS and GNKH is shown in FIG. 31.

[0140] The genomic sequence of GNKH contains two exons of length 788 bp and 343 bp, respectively, corresponding to nucleic acid residues 888 through 1669 and nucleic acid residues 9552 through 9893, respectively, of the GNKH genomic sequence shown in SEQ ID NO:124. These two exons are separated by an approximately 8 kb (7882 base pair) intronic region which corresponds to nucleic acid residues 1670 through 9551 of the GNKH genomic sequence shown in SEQ ID NO:124.

[0141] Thus, the nucleic acid molecules of the present invention also include GNKH nucleic acid molecules, including: (a) nucleotide sequences, and fragments thereof, that encode a GNKH gene product, or a fragment thereof, including sequences that encode an amino acid sequence depicted in SEQ ID NO:75 or 76 (e.g., the nucleotide sequences depicted in SEQ ID NOs:74 and 102); (b) nucleotide sequences corresponding to fragments of a GNKH gene (e.g., fragments of SEQ ID NOs:74 and 102) that are at least 402 nucleotides in length or, alternatively, at least 458 nucleotides in length; (c) nucleotide sequences that encode one or more functional domains of a GNKH gene product; (d) nucleotide sequences that comprise GNKH gene sequences of upstream untranslated regions, intronic regions and/or downstream untranslated regions, or fragments thereof, of the GNKH nucleotide sequence in (a), above; (e) nucleotide sequences comprising the novel GNKH sequences disclosed herein that encode mutants of the GNKH gene product in which all or a part of one or more of the domains is deleted or altered, as well as fragments thereof; (f) nucleotide sequences that encode fusion proteins comprising a GNKH gene product; and (g) nucleotide sequences (e.g., primers) within the GNKH gene and chromosome 18p nucleotide sequences flanking the GNKH gene which can be utilized, e.g., as part of the methods of the invention for identifying and diagnosing individuals at risk for or exhibiting a GNKH-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia).

[0142] The GNKH nucleotide sequences of the invention further include nucleotide sequences corresponding to the nucleotide sequences of (a) through (g), above, wherein one or more of the exons, or fragments thereof, have been deleted.

[0143] The GNKH nucleotide sequences of the invention also include nucleotide sequences that have at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more nucleotide sequence identity to the GNKH nucleotide sequences of (a) through (g), above. Further, the GNKH nucleotide sequences of the invention also include nucleotide sequences that encode polypeptides having at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or higher amino acid sequence identity to the polypeptides encoded by the GNKH nucleotide sequences of (a) through (g), above (e.g., polypeptides depicted in SEQ ID NOs:75 and 76). The percent identity of two amino acid sequences or of two nucleic acid sequences can be readily determined, as described in Section 5.1.1, above, for HKNG1 nucleotide and polypeptide sequences.

[0144] The GNKH nucleotide sequences of the invention further include any nucleotide sequence that hybridizes to a GNKH nucleic acid molecule of the invention: (a) under stringent conditions, e.g., hybridization to filter-bound DNA in 6× sodium chloride/sodium citrate (SSC) at about 45° C. followed by one or more washes in 0.2×SSC/0.1% SDS at about 50-65° C.; or (b) under highly stringent conditions, e.g., hybridization to filter-bound nucleic acid in 6×SSC at about 45° C. followed by one or more washes in 0.1×SSC/0.2% SDS at about 68° C., or under other hybridization conditions which are apparent to those of skill in the art (see, for example, Ausubel F. M. et al., eds., 1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at pp. 6.3.1-6.3.6 and 2.10.3). Preferably the GNKH nucleic acid molecule that hybridizes to the nucleotide sequence of (a) and (b), above, is one that comprises the complement of a nucleic acid molecule that encodes a GNKH gene product. In a preferred embodiment, nucleic acid molecules comprising the nucleotide sequences of (a) and (b), above, encode gene products, e.g., gene products functionally equivalent to a GNKH gene product.

[0145] Functionally equivalent GNKH gene products include naturally occurring GNKH gene products present in the same or different species. In one embodiment, GNKH gene sequences in non-human species map to chromosome regions syntenic to the human 18p chromosome location within which human GNKH lies. In another embodiment, GNKH gene sequences in non-human species map to a strand of a chromosome of the organism that is opposite an ortholog or homolog HKNG1, TS or rTS sequence of that organism. Functionally equivalent GNKH gene products also include gene products that retain at least one of the biological activities of the GNKH gene products, and/or which are recognized by and bind to antibodies (polyclonal or monoclonal) directed against the GNKH gene products.

[0146] Among the nucleic acid molecules of the invention are deoxyoligonucleotides (“oligos”) which hybridize under highly stringent or stringent conditions to the GNKH nucleic acid molecules described above. Appropriate, exemplary highly stringent and stringent hybridization conditions for such oligo sequences include the stringent and highly stringent hybridization conditions discussed, above, in subsection 5.1.1.

[0147] These nucleic acid molecules may encode or act as antisense molecules, useful, for example, in GNKH gene regulation, and/or as antisense primers in amplification reactions of GNKH gene nucleic acid sequences. Further, such sequences may be used as part of ribozyme and/or triple helix sequences, also useful for GNKH gene regulation. Still further, such molecules may be used as components of diagnostic methods whereby, for example, the presence of a particular GNKH allele involved in a GNKH-related disorder (e.g., a neuropsychiatric disorder, such as BAD), may be detected.

[0148] Fragments of the GNKH nucleic acid molecules can be at least 10 nucleotides in length. In alternative embodiments, the fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more contiguous nucleotides in length. Alternatively, the fragments can comprise sequences that encode at least 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450 or more contiguous amino acid residues of the GNKH gene products. Fragments of the GNKH nucleic acid molecules can also refer to GNKH exons or introns, and, further, can refer to portions of GNKH coding regions that encode domains of GNKH gene products.

5.1.3. THE TS GENE

[0149] Unless otherwise stated, the term “TS nucleic acid” or “TS gene” is understood to refer collectively to those sequences described in this subsection as well as to allelic variants and polymorphisms of those sequences such as the allelic variants and polymorphisms described, below, in Section 5.1.4. In particular, the genomic structure of the human TS gene has been elucidated and is depicted in FIGS. 44A-G and in SEQ ID NO:140 (Kaneda et al. J. Biol. Chem. 265 (33), 20277-20284 (1990): MEDLINE 91056070). The intronic structure of the human TS gene has also been elucidated and is also disclosed in FIGS. 44A-G. The exons of the human TS gene are also depicted, schematically, in FIGS. 44A-G.

[0150] The genomic sequence of TS contains seven exons, corresponding to nucleic acid residues 1001 through 1205, nucleic acid residues 2895 through 2968, nucleic acid residues 5396 through 5570, nucleic acid residues 11843 through 11944, nucleic acid residues 13449 through 13624, nucleic acid residues 14133 through 14204, and nucleic acid residues 15613 through 15750, respectively, of SEQ ID NO:140. These seven exons are separated by intronic regions which correspond to nucleic acid residues 1206 through 2894, nucleic acid residues 2969 through 5395, nucleic acid residues 5571 through 11842, nucleic acid residues 11945 through 13448, nucleic acid residues 13625 through 14132, and nucleic acid residues 14205 through 15612, respectively, of SEQ ID NO:140.
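The relationship between the exon and intron coordinates recited in paragraph [0150] can be checked mechanically. The following Python sketch derives the intron boundaries from the seven exon boundaries and computes inclusive span lengths. The variable and function names are illustrative assumptions, and the 1-based inclusive coordinate convention is assumed from the way the residue ranges abut one another.

```python
# Exon boundaries of TS as recited above (positions in SEQ ID NO:140),
# assumed to be 1-based and inclusive.
TS_EXONS = [
    (1001, 1205),
    (2895, 2968),
    (5396, 5570),
    (11843, 11944),
    (13449, 13624),
    (14133, 14204),
    (15613, 15750),
]


def span_length(start, end):
    """Number of residues in a 1-based inclusive interval."""
    return end - start + 1


def introns_between(exons):
    """Derive intron intervals as the gaps between consecutive exons."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


if __name__ == "__main__":
    for number, (start, end) in enumerate(TS_EXONS, start=1):
        print(f"exon {number}: {start}-{end} ({span_length(start, end)} nt)")
    for number, (start, end) in enumerate(introns_between(TS_EXONS), start=1):
        print(f"intron {number}: {start}-{end} ({span_length(start, end)} nt)")
```

The six derived intervals coincide with the intronic regions listed in paragraph [0150], which confirms that the exon and intron ranges tile residues 1001 through 15750 without gaps or overlaps.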

[0151] A human TS cDNA sequence (SEQ ID NO:141) encoding the full length amino acid sequence (SEQ ID NO:142) of the TS polypeptide is depicted in FIGS. 45A-B. This human TS gene encodes a transmembrane polypeptide of 313 amino acid residues, as shown in FIG. 45B and in SEQ ID NO:142. The nucleotide sequence of the portion of this full length human TS cDNA corresponding to the open reading frame (“ORF”) encoding this TS gene product is depicted as SEQ ID NO:143.

[0152] FIG. 46 depicts a hydropathy plot of human TS protein. Relatively hydrophobic residues are above the horizontal line, and relatively hydrophilic residues are below the horizontal line. The cysteine residues (cys) and potential N-glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy trace.

[0153] In one embodiment, human TS protein is a transmembrane protein that contains extracellular domains at amino acid residues 1-186 and 244-313 of SEQ ID NO:142 (SEQ ID NO:144 and SEQ ID NO:145, respectively), transmembrane domains at amino acid residues 187 to 204 and 219-243 of SEQ ID NO:142 (SEQ ID NO:146 and SEQ ID NO:147, respectively), and a cytoplasmic domain at amino acid residues 205-218 of SEQ ID NO:142 (SEQ ID NO:149). Alternatively, in another embodiment, a human TS protein contains an extracellular domain at amino acid residues 205 to 218 of SEQ ID NO:142 (SEQ ID NO:150), transmembrane domains at amino acid residues 187 to 204 and 219-243 of SEQ ID NO:142 (SEQ ID NO:150 and SEQ ID NO:151, respectively), and cytoplasmic domains at amino acid residues 1-186 and 244-313 of SEQ ID NO:142 (SEQ ID NO:152 and SEQ ID NO:153, respectively).
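The two alternative topologies in paragraph [0153] share the same five segment boundaries and differ only in which side of the membrane the non-transmembrane segments face. A minimal Python sketch of that relationship follows. The names used are illustrative assumptions, and the code models only the residue ranges recited above, not the SEQ ID NO assignments.

```python
TS_LENGTH = 313  # residues in SEQ ID NO:142, as stated above

# Segment boundaries shared by both embodiments (1-based, inclusive).
SEGMENTS = [(1, 186), (187, 204), (205, 218), (219, 243), (244, 313)]

# Segment labels for the first embodiment, in N- to C-terminal order.
FIRST_TOPOLOGY = [
    "extracellular", "transmembrane", "cytoplasmic",
    "transmembrane", "extracellular",
]


def flip_topology(labels):
    """Swap extracellular and cytoplasmic labels; membrane spans are unchanged."""
    swap = {"extracellular": "cytoplasmic", "cytoplasmic": "extracellular"}
    return [swap.get(label, label) for label in labels]


def tiles_protein(segments, length):
    """True if the segments cover residues 1..length with no gap or overlap."""
    expected_start = 1
    for start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


if __name__ == "__main__":
    print(tiles_protein(SEGMENTS, TS_LENGTH))
    for (start, end), label in zip(SEGMENTS, flip_topology(FIRST_TOPOLOGY)):
        print(f"{start}-{end}: {label}")
```

Flipping the first topology yields the second embodiment: cytoplasmic segments at residues 1-186 and 244-313, and an extracellular segment at residues 205-218.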

[0154] Human TS protein has one N-glycosylation site with the sequence NGSR (at amino acid residues 112 to 115 of SEQ ID NO:142).

[0155] Human TS protein has one glycosaminoglycan attachment site with the sequence SGQG (at amino acid residues 154 to 157 of SEQ ID NO:142).

[0156] Six protein kinase C phosphorylation sites are present in human TS protein. The first has the sequence SLR (at amino acid residues 66 to 68 of SEQ ID NO:142), the second has the sequence TTK (at amino acid residues 75 to 77 of SEQ ID NO:142), the third has the sequence SSK (at amino acid residues 102 to 104 of SEQ ID NO:142), the fourth has the sequence STR (at amino acid residues 124 to 126 of SEQ ID NO:142), the fifth has the sequence TIK (at amino acid residues 167 to 169 of SEQ ID NO:142), and the sixth has the sequence TIK (at amino acid residues 306 to 308 of SEQ ID NO:142).

[0157] Human TS protein has four casein kinase II phosphorylation sites. The first has the sequence SLRD (at amino acid residues 66 to 69 of SEQ ID NO:142), the second has the sequence STRE (at amino acid residues 124 to 127 of SEQ ID NO:142), the third has the sequence TNPD (at amino acid residues 170 to 173 of SEQ ID NO:142), and the fourth has the sequence TLGD (at amino acid residues 251 to 254 of SEQ ID NO:142).

[0158] Human TS protein has a tyrosine kinase phosphorylation site with the sequence RDMESDY (at amino acid residues 147 to 153 of SEQ ID NO:142).

[0159] Human TS protein has three N-myristoylation sites. The first has the sequence GSTNAK (at amino acid residues 94 to 99 of SEQ ID NO:142), the second has the sequence GVPFNI (at amino acid residues 222 to 227 of SEQ ID NO:142), and the third has the sequence GLKPGD (at amino acid residues 242 to 247 of SEQ ID NO:142).

[0160] Human TS protein has a thymidylate synthase active site with the sequence LPPCHALCQFYV (at amino acid residues 192 to 203 of SEQ ID NO:142).

[0161] Thus, the nucleic acid molecules of the present invention also include TS nucleic acid molecules, including: (a) nucleotide sequences, and fragments thereof, that encode a TS gene product, or a fragment thereof, including sequences that encode an amino acid sequence depicted in SEQ ID NO:142 (e.g., the nucleotide sequence depicted in SEQ ID NO:143); (b) nucleotide sequences corresponding to fragments of a TS gene (e.g., fragments of SEQ ID NO:142) that are at least 71, 73, 101, 137, 174, 175, or 204 nucleotides in length (corresponding to the lengths of Exons 6, 2, 4, 7, 3, 5, and 1, respectively); (c) nucleotide sequences that encode one or more functional domains of a TS gene product; (d) nucleotide sequences that comprise TS gene sequences of upstream untranslated regions, intronic regions and/or downstream untranslated regions, or fragments thereof, of the TS nucleotide sequence in (a), above; (e) nucleotide sequences comprising the novel TS sequences disclosed herein that encode mutants of the TS gene product in which all or a part of one or more of the domains is deleted or altered, as well as fragments thereof; (f) nucleotide sequences that encode fusion proteins comprising a TS gene product; and (g) nucleotide sequences (e.g., primers) within the TS gene and chromosome 18p nucleotide sequences flanking the TS gene which can be utilized, e.g., as part of the methods of the invention for identifying and diagnosing individuals at risk for or exhibiting a TS-mediated disorder such as a neuropsychiatric disorder (e.g., BAD or schizophrenia).

[0162] The TS nucleotide sequences of the invention further include nucleotide sequences corresponding to the nucleotide sequences of (a) through (g), above, wherein one or more of the exons, or fragments thereof, have been deleted.

[0163] The TS nucleotide sequences of the invention also include nucleotide sequences that have at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more nucleotide sequence identity to the TS nucleotide sequences of (a) through (g), above. Further, the TS nucleotide sequences of the invention also include nucleotide sequences that encode polypeptides having at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or higher amino acid sequence identity to the polypeptides encoded by the TS nucleotide sequences of (a) through (g), above (e.g., the polypeptide depicted in SEQ ID NO:142). The percent identity of two amino acid sequences or of two nucleic acid sequences can be readily determined, as described in Section 5.1.1, above, for HKNG1 nucleotide and polypeptide sequences.

[0164] The TS nucleotide sequences of the invention further include any nucleotide sequence that hybridizes to a TS nucleic acid molecule of the invention: (a) under stringent conditions, e.g., hybridization to filter-bound DNA in 6× sodium chloride/sodium citrate (SSC) at about 45° C. followed by one or more washes in 0.2×SSC/0.1% SDS at about 50-65° C.; or (b) under highly stringent conditions, e.g., hybridization to filter-bound nucleic acid in 6×SSC at about 45° C. followed by one or more washes in 0.1×SSC/0.2% SDS at about 68° C., or under other hybridization conditions which are apparent to those of skill in the art (see, for example, Ausubel F. M. et al., eds., 1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at pp. 6.3.1-6.3.6 and 2.10.3). Preferably the TS nucleic acid molecule that hybridizes to the nucleotide sequence of (a) and (b), above, is one that comprises the complement of a nucleic acid molecule that encodes a TS gene product. In a preferred embodiment, nucleic acid molecules comprising the nucleotide sequences of (a) and (b), above, encode gene products, e.g., gene products functionally equivalent to a TS gene product.

[0165] Functionally equivalent TS gene products include naturally occurring TS gene products present in the same or different species. In one embodiment, TS gene sequences in non-human species map to chromosome regions syntenic to the human 18p chromosome location within which human TS lies. In another embodiment, TS gene sequences in non-human species map to a strand of a chromosome of the organism that is opposite an ortholog or homolog HKNG1, or TS sequence of that organism. Functionally equivalent TS gene products also include gene products that retain at least one of the biological activities of the TS gene products, and/or which are recognized by and bind to antibodies (polyclonal or monoclonal) directed against the TS gene products.

[0166] Among the nucleic acid molecules of the invention are deoxyoligonucleotides (“oligos”) which hybridize under highly stringent or stringent conditions to the TS nucleic acid molecules described above. Appropriate, exemplary highly stringent and stringent hybridization conditions for such oligo sequences include the stringent and highly stringent hybridization conditions discussed, above, in subsection 5.1.1.

[0167] These nucleic acid molecules may encode or act as antisense molecules, useful, for example, in TS gene regulation, and/or as antisense primers in amplification reactions of TS gene nucleic acid sequences. Further, such sequences may be used as part of ribozyme and/or triple helix sequences, also useful for TS gene regulation. Still further, such molecules may be used as components of diagnostic methods whereby, for example, the presence of a particular TS allele involved in a TS-related disorder (e.g., a neuropsychiatric disorder, such as BAD), may be detected.

[0168] Fragments of the TS nucleic acid molecules can be at least 10 nucleotides in length. In alternative embodiments, the fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more contiguous nucleotides in length. Alternatively, the fragments can comprise sequences that encode at least 10, 20, 30, 40, 50, 100, 150, 200, 225, 250, 275, 300, 315, or 313 contiguous amino acid residues of the TS gene products. Fragments of the TS nucleic acid molecules can also refer to TS exons or introns, and, further, can refer to portions of TS coding regions that encode domains of TS gene products.

5.1.4. POLYMORPHISMS AND ALLELIC VARIANTS

[0169] As will be appreciated by those skilled in the art, DNA sequence polymorphisms of a HKNG1, GNKH and/or a TS gene will exist within a population of individual organisms (e.g., within a human population). Polymorphisms may exist, for example, among individuals in a population due to natural allelic variation, and include, e.g., polymorphisms that lead to changes in the amino acid sequence of a HKNG1, GNKH or a TS gene product, as well as “silent” polymorphisms that do not lead to changes in the amino acid sequence of a HKNG1, GNKH or a TS gene product.

[0170] As the term is used both herein and in the art, an allele is understood to refer to one of a group of genes which occur alternatively at a given genetic locus. Thus, an “allelic variant” is understood to refer to a nucleotide sequence which occurs at a given locus or to a gene product encoded by that nucleotide sequence. Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of a given gene. Alternative alleles can be readily identified, e.g., by sequencing the gene of interest in a number of different individuals. For example, hybridization probes can be used to identify the same genetic locus in a variety of individuals, and the genetic sequence of that locus in each individual can be obtained using standard sequencing techniques that are well known in the art. With respect to HKNG1, GNKH and TS allelic variants, any and all such nucleotide variations and resulting amino acid polymorphisms or variations that are the result of natural allelic variation of the HKNG1, GNKH and TS genes are intended to be within the scope of the present invention. Such allelic variants include, but are not limited to, allelic variants that do not alter the functional activity of the HKNG1, GNKH or a TS gene product.

[0171] HKNG1 allelic variants of the invention include, but are not limited to, HKNG1 variants comprising the specific polymorphisms described herein, e.g., in FIGS. 5A-5C and in the examples presented hereinbelow in Sections 8 and 18, including the specific polymorphisms listed in Tables 12A-12B. These exemplary allelic variants also include a particular variant which encodes the full length HKNG1 polypeptide (SEQ ID NO:2) wherein the glutamic acid at amino acid position 202 of SEQ ID NO:2 is a lysine. The exemplary allelic variants further include a particular variant which encodes the splice variant HKNG1-V1 polypeptide (SEQ ID NO:4) wherein the lysine amino acid at amino acid residue position 184 of SEQ ID NO:4 is a glutamic acid.

[0172] GNKH allelic variants of the invention include, but are not limited to, GNKH variants comprising the specific polymorphisms described herein, e.g., in the example presented in Section 17 (see, e.g., Table 9).

[0173] TS allelic variants of the invention include, but are not limited to, TS variants comprising the specific polymorphisms described herein, e.g., in the example presented in Section 21 (see, e.g., Table 15).

[0174] With respect to the cloning of additional allelic variants of the human HKNG1, GNKH and/or TS genes and homologues and orthologs from other species (e.g., guinea pig, cow, rat and mouse), the isolated HKNG1, GNKH and TS gene sequences disclosed herein may be labeled and used to screen a cDNA library constructed from mRNA obtained from appropriate cells or tissues (e.g., brain or retinal tissues) derived from the organism (e.g., guinea pig, cow, rat and mouse) of interest. The hybridization conditions used should generally be of a lower stringency when the cDNA library is derived from an organism different from the type of organism from which the labeled sequence was derived, and can routinely be determined based on, e.g., relative relatedness of the target and reference organisms.

[0175] Alternatively, the labeled fragment may be used to screen a genomic library derived from the organism of interest, again, using appropriately stringent conditions. Appropriate stringency conditions are well known to those of skill in the art as discussed, above, in Sections 5.1.1 and 5.1.2, and will vary predictably depending on the specific organisms from which the library and the labeled sequences are derived. For guidance regarding such conditions see, for example, Sambrook, et al., 1989, Molecular Cloning, A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N.Y.; and Ausubel, et al., 1989-1999, Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N.Y., both of which are incorporated herein by reference in their entirety.

[0176] Further, a HKNG1, GNKH or TS gene allelic variant may be isolated from, for example, human nucleic acid, by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of amino acid sequences within a HKNG1, GNKH or TS gene product disclosed herein. The template for the reaction may be cDNA obtained by reverse transcription of mRNA prepared from, for example, human or non-human cell lines or tissue known or suspected to express a wild type or mutant HKNG1, GNKH or TS gene allele (such as, for example, brain cells, including brain cells from individuals having BAD). In one embodiment, the allelic variant is isolated from an individual who has a HKNG1-mediated disorder. In another embodiment, the allelic variant is isolated from an individual who has a GNKH-mediated disorder. In another embodiment, the allelic variant is isolated from an individual who has a TS-mediated disorder. Such variants are described in the examples below.
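When a degenerate primer pool is designed from an amino acid sequence as in paragraph [0176], the number of distinct oligonucleotides in the pool is the product of the number of codons that can encode each residue. The following Python sketch illustrates that arithmetic under the standard genetic code. The function name and the example peptides are hypothetical and are not taken from the HKNG1, GNKH or TS sequences disclosed herein.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons excluded).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def pool_degeneracy(peptide):
    """Count the distinct coding sequences that could encode the peptide."""
    try:
        return math.prod(CODON_COUNTS[residue] for residue in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid code: {err.args[0]}") from None


if __name__ == "__main__":
    # Hypothetical peptide stretches, chosen only to contrast low and high degeneracy.
    for peptide in ("MWHK", "LSR"):
        print(peptide, pool_degeneracy(peptide))
```

Stretches rich in methionine and tryptophan give small pools, whereas stretches rich in leucine, serine or arginine give large ones, which is one reason such a calculation can help in choosing which part of a gene product to base a primer pool on.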

[0177] The PCR product may be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a HKNG1, GNKH or TS gene nucleic acid sequence. The PCR fragment may then be used to isolate a full length cDNA clone by a variety of methods. For example, the amplified fragment may be labeled and used to screen a bacteriophage cDNA library. Alternatively, the labeled fragment may be used to isolate genomic clones via the screening of a genomic library.

[0178] PCR technology may also be utilized to isolate full length cDNA sequences. For example, RNA may be isolated, following standard procedures, from an appropriate cellular or tissue source (i.e., one known, or suspected, to express a HKNG1, GNKH or TS gene, such as, for example, brain tissue samples obtained through biopsy or post-mortem). A reverse transcription reaction may be performed on the RNA using an oligonucleotide primer specific for the most 5′ end of the amplified fragment for the priming of first strand synthesis. The resulting RNA/DNA hybrid may then be “tailed” with guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand synthesis may then be primed with a poly-C primer. Thus, cDNA sequences upstream of the amplified fragment may easily be isolated. For a review of cloning strategies that may be used, see e.g., Sambrook et al., 1989, supra, or Ausubel et al., supra.

[0179] A cDNA of an allelic, e.g., mutant, variant of a HKNG1, GNKH or TS gene may be isolated, for example, by using PCR, a technique that is well known to those of skill in the art. In this case, the first cDNA strand may be synthesized by hybridizing an oligo-dT oligonucleotide to mRNA isolated from tissue known or suspected to express the gene in an individual putatively carrying a mutant HKNG1, GNKH or TS allele, and by extending the new strand with reverse transcriptase. The second strand of the cDNA is then synthesized using an oligonucleotide that hybridizes specifically to the 5′ end of the normal gene. Using these two primers, the product is then amplified via PCR, cloned into a suitable vector, and subjected to DNA sequence analysis through methods well known to those of skill in the art. By comparing the DNA sequence of the mutant allele to that of the normal allele, the mutation(s) responsible for the loss or alteration of function of the mutant gene product can be ascertained.

[0180] Alternatively, a genomic library can be constructed using DNA obtained from an individual suspected of or known to carry a mutant HKNG1, GNKH or TS allele, or a cDNA library can be constructed using RNA from a tissue known, or suspected, to express a mutant HKNG1, GNKH or TS allele. An unimpaired HKNG1, GNKH or TS gene, or any suitable fragment thereof, may then be labeled and used as a probe to identify the corresponding mutant allele in such libraries. Clones containing the mutant gene sequences may then be purified and subjected to sequence analysis according to methods well known to those of skill in the art.

[0181] Additionally, an expression library can be constructed utilizing cDNA synthesized from, for example, RNA isolated from a tissue known, or suspected, to express a mutant HKNG1 allele in an individual suspected of or known to carry such a mutant allele. In this manner, gene products made by the putatively mutant tissue may be expressed and screened using standard antibody screening techniques in conjunction with antibodies raised against the normal gene product, as described, below, in Section 5.3. (For screening techniques, see, for example, Harlow and Lane, eds., 1988, “Antibodies: A Laboratory Manual”, Cold Spring Harbor Press, Cold Spring Harbor.)

[0182] In cases where a mutation results in an expressed HKNG1, GNKH or TS gene product with altered function (e.g., as a result of a missense or a frameshift mutation), a polyclonal set of anti-HKNG1 gene product antibodies, anti-GNKH gene product antibodies or anti-TS gene product antibodies is likely to cross-react with the mutant gene product. Library clones detected via their reaction with such labeled antibodies can be purified and subjected to sequence analysis according to methods well known to those of skill in the art.

[0183] Mutations and polymorphisms of HKNG1, GNKH and/or TS can further be detected using PCR amplification techniques. Primers can routinely be designed to amplify overlapping regions of a whole HKNG1, GNKH or TS sequence including the promoter regulating region of a HKNG1, GNKH or TS sequence. In one embodiment, primers are designed to cover the exon-intron boundaries such that coding regions can be scanned for mutations. Exemplary primers for analyzing HKNG1 exons are provided in Table 1, of Section 5.6, below, and in the Examples presented hereinbelow.

[0184] The invention also includes nucleic acid molecules, preferably DNA molecules, that are the complements of the nucleotide sequences of the preceding paragraphs.

[0185] The HKNG1, GNKH and TS nucleic acid molecules of the invention also comprise, in certain embodiments, heterologous sequences (e.g., nucleotide sequences of cloning or expression vectors, and non-endogenous promoter elements) for expressing a non-endogenous HKNG1, GNKH and/or TS nucleic acid molecule or a non-endogenous HKNG1, GNKH and/or TS gene product in a cell or, alternatively, for expressing an endogenous HKNG1, GNKH and/or TS gene or gene product in a cell (e.g., using a non-endogenous promoter element). In other embodiments, the HKNG1, GNKH and TS nucleic acid molecules do not include such heterologous sequences.

5.2. CHROMOSOME 18P GENE PRODUCTS

[0186] HKNG1, GNKH and TS gene products, or peptide fragments thereof, can be prepared for a variety of uses. For example, such gene products, or peptide fragments thereof, can be used for the generation of antibodies, in diagnostic assays, or for the identification of other cellular or extracellular gene products involved in the regulation of HKNG1-mediated, GNKH-mediated or TS-mediated disorders, e.g., neuropsychiatric disorders, such as BAD.

[0187] The gene products of the invention include, but are not limited to, human HKNG1 gene products, e.g., polypeptides comprising the amino acid sequences depicted in FIGS. 1A-1C, 2A-2C, 17 and 18A-18C (i.e., SEQ ID NOs:2, 4, 51, and 66). The gene products of the invention also include non-human, e.g., mammalian (such as bovine, guinea pig and rat), HKNG1 gene products. Such non-human HKNG1 gene products include, but are not limited to, polypeptides comprising the amino acid sequences depicted in FIGS. 7-13, 35 and 38 (i.e., SEQ ID NOs:39, 41, 43, 45, 49 and 76).

[0188] The HKNG1 gene product, sometimes referred to herein as an “HKNG1 protein” or “HKNG1 polypeptide,” includes those gene products encoded by the HKNG1 gene sequences described in Section 5.1.1, above, including, e.g., the HKNG1 gene sequences depicted in FIGS. 1A-1C, 2A-2C, 7A-7C, 13A-13C, 17 and 18A-18C, as well as gene products encoded by other human allelic variants and non-human variants of HKNG1 that can be identified by the methods herein described. Among such HKNG1 gene product variants are gene products comprising HKNG1 amino acid residues encoded by allelic variants of the HKNG1 gene, as described in Section 5.1.4, and including allelic variants comprising the polymorphisms depicted in FIGS. 5A-5C and in the Examples presented hereinbelow, e.g., in Sections 8 and 18, including the gene products encoded by allelic variants of HKNG1 comprising the polymorphisms disclosed in Tables 12A-12B. Such HKNG1 gene product variants also include a variant of the HKNG1 gene product depicted in FIGS. 1A-1C (SEQ ID NO:2) wherein the amino acid residue Lys202 is mutated to a glutamic acid residue. Such HKNG1 gene product variants also include a variant of the HKNG1 gene product depicted in FIGS. 2A-2C (SEQ ID NO:4) wherein the amino acid residue Lys184 is mutated to a glutamic acid residue.

[0189] The gene products of the invention also include, but are not limited to, GNKH gene products, such as polypeptides comprising one or more of the amino acid sequences depicted in FIGS. 32-33 (SEQ ID NOs:75-76). The GNKH gene product, sometimes referred to herein as the “GNKH protein” or “GNKH polypeptide,” includes those gene products encoded by the GNKH gene sequences depicted in FIGS. 28 and 30A-30B (SEQ ID NOs:74 and 124), as well as gene products encoded by other human allelic variants and non-human variants (e.g., orthologs and homologs) of GNKH that can be identified by the methods described hereinabove (e.g., in Section 5.1.3). Among such GNKH gene product variants are gene products comprising GNKH amino acid residues encoded by allelic variants of the GNKH gene as described, above, in Section 5.1.3, and including GNKH allelic variants comprising the specific polymorphisms described herein, e.g., in the example presented in Section 17 (see, e.g., Table 9).

[0190] The gene products of the invention also include, but are not limited to, TS gene products, such as polypeptides comprising one or more of the amino acid sequences depicted in FIG. 45B (SEQ ID NO:142). The TS gene product, sometimes referred to herein as the “TS protein” or “TS polypeptide,” includes those gene products encoded by the TS gene sequences depicted in FIGS. 44A-G and 45A (SEQ ID NOs:140 and 141), as well as gene products encoded by other human allelic variants and non-human variants (e.g., orthologs and homologs) of TS that can be identified by the methods described hereinabove (e.g., in Section 5.1.3). Among such TS gene product variants are gene products comprising TS amino acid residues encoded by allelic variants of the TS gene as described, above, in Section 5.1.3, and including TS allelic variants comprising the specific polymorphisms described herein, e.g., in the example presented in Section 21 (see, e.g., Table 15).

[0191] In addition, HKNG1, GNKH and TS gene products of the invention may include proteins that represent functionally equivalent gene products. Functionally equivalent gene products may include, for example, gene products encoded by one of the HKNG1, GNKH or TS nucleic acid molecules described in Section 5.1, above. In preferred embodiments, such functionally equivalent gene products are naturally occurring gene products. Functionally equivalent HKNG1, GNKH and TS gene products also include gene products that retain at least one of the biological activities of the above-described HKNG1, GNKH and TS gene products, and/or which are recognized by and bind to antibodies (polyclonal or monoclonal) directed against HKNG1, GNKH or TS gene products.

[0192] A functionally equivalent gene product may contain deletions, including internal deletions, additions, including additions yielding fusion proteins, or substitutions of amino acid residues within and/or adjacent to the amino acid sequence encoded by the HKNG1, GNKH and/or TS gene sequences described, above, in Section 5.1. Generally, deletions will be deletions of single amino acid residues, or deletions of no more than about 2, 3, 4, 5, 10 or 20 amino acid residues (either contiguous or non-contiguous amino acid residues). Generally, additions or substitutions, other than additions that yield fusion proteins, will be additions or substitutions of single amino acid residues, or additions or substitutions of no more than about 2, 3, 4, 5, 10 or 20 amino acid residues (either contiguous or non-contiguous amino acid residues). Preferably, these modifications result in a “silent” change, in that the change produces a HKNG1, GNKH or TS gene product with the same activity as the HKNG1, GNKH or TS gene product depicted in FIGS. 1A-1C, 2A-2C, 7-13 or 17 (HKNG1), in FIGS. 32-33 (GNKH), or in FIG. 45B (TS).

[0193] Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
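
By way of illustration only, the residue groupings listed above can be encoded as a lookup table. The following Python sketch uses the standard one-letter amino acid codes; the function names, and the simplifying rule that a substitution is treated as conservative when both residues fall in the same listed group, are assumptions made for this example and are not a definition taken from the specification.

```python
# Residue groups as listed in the paragraph above, in one-letter code.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the listed group that contains the residue."""
    code = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"not a standard one-letter amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Illustrative rule (an assumption): same group means conservative."""
    return residue_class(original) == residue_class(replacement)
```

Under this simplified rule, a lysine-to-arginine change stays within the basic group, whereas a lysine-to-glutamic acid change crosses from the basic group to the acidic group and would be classed as non-conservative.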

[0194] Alternatively, where alteration of function is desired, one or more additions, deletions or non-conservative alterations can produce altered HKNG1, GNKH and/or TS gene products, including HKNG1, GNKH and/or TS gene products with reduced or enhanced activity. Such alterations can, for example, alter one or more of the biological functions of the HKNG1, GNKH and/or TS gene product. Further, such alterations can be selected so as to generate HKNG1, GNKH and/or TS gene products that are better suited for expression, scale up, etc. in the host cells chosen. For example, cysteine residues can be deleted or substituted with another amino acid residue in order to eliminate disulfide bridges.

[0195] As another example, altered HKNG1, GNKH and/or TS gene products can be engineered that correspond to variants of the gene product associated with HKNG1, GNKH and/or TS-mediated neuropsychiatric disorders such as BAD. Specific examples of such altered gene products include, but are not limited to (in the particular case of HKNG1 gene products), HKNG1 proteins or peptides comprising substitution of a lysine residue for the wild-type glutamic acid residue at HKNG1 amino acid position 202 in FIGS. 1A-1C (SEQ ID NO:2) or amino acid position 184 in FIGS. 2A-2C (SEQ ID NO:4).

[0196] The protein fragments and/or peptides of the invention (i.e., HKNG1 protein fragments and peptides, GNKH protein fragments and peptides and TS protein fragments and peptides) comprise at least as many contiguous amino acid residues of a HKNG1, GNKH or TS protein sequence as are necessary to represent an epitope fragment (that is to be recognized by an antibody directed to the HKNG1, GNKH or TS protein). For example, such protein fragments or peptides comprise at least about 8 contiguous amino acid residues from a full length HKNG1, GNKH or TS protein. In alternate embodiments, the protein fragments and peptides of the invention can comprise about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450 or more contiguous amino acid residues of a HKNG1, GNKH or TS protein.
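
As a hedged illustration of what "contiguous amino acid residues" means in practice, the sketch below enumerates every contiguous fragment of a chosen length from a protein sequence. The function name and the short test sequence in the usage note are hypothetical and are not taken from any sequence disclosed herein.

```python
def contiguous_fragments(sequence: str, length: int = 8) -> list[str]:
    """Return every contiguous fragment of the given length, in order.

    An empty list is returned when the sequence is shorter than `length`.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
```

For a hypothetical 11-residue sequence such as "MKTAYIAKQRQ", the function yields four 8-residue fragments, the first being "MKTAYIAK" and the last "AYIAKQRQ".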

[0197] Peptides and/or proteins corresponding to one or more domains of a HKNG1, GNKH or TS protein as well as fusion proteins in which a HKNG1, GNKH or TS protein, or a portion thereof (e.g., a truncated HKNG1, GNKH or TS protein or peptide, or a HKNG1, GNKH or TS protein domain), is fused to an unrelated protein are also within the scope of this invention. Such proteins and peptides can be designed on the basis of the HKNG1, GNKH or TS nucleotide sequences disclosed in Section 5.1, above, and/or on the basis of the HKNG1, GNKH or TS amino acid sequence disclosed in this Section. Fusion proteins include, but are not limited to: IgFc fusions which stabilize the HKNG1, GNKH or TS protein or peptide and prolong its half life in vivo; fusions to any amino acid sequence that allows the fusion protein to be anchored to the cell membrane; and fusions to an enzyme, fluorescent protein, luminescent protein, or a flag epitope protein or peptide which provides a marker function.

[0198] For example, the HKNG1 protein sequences described above can include a domain which comprises a signal sequence that targets the HKNG1 gene product for secretion. As used herein, a signal sequence includes a peptide of at least about 15 or 20 amino acid residues in length which occurs at the N-terminus of secretory and membrane-bound proteins and which contains at least about 70% hydrophobic amino acid residues such as alanine, leucine, isoleucine, phenylalanine, proline, tyrosine, tryptophan, or valine. In a preferred embodiment, a signal sequence contains at least about 10 to 40 amino acid residues, preferably about 19-34 amino acid residues, and has at least about 60-80%, more preferably 65-75%, and even more preferably at least about 70% hydrophobic residues. A signal sequence serves to direct a protein containing such a sequence to a lipid bilayer.
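
The hydrophobic-content criterion described above can be sketched as a simple calculation. In the following Python example, the hydrophobic set is limited to the eight residues named in the paragraph, and the 20-residue window and 0.70 threshold are taken from the "at least about 15 or 20" and "at least about 70%" figures; the function names are hypothetical, and the check is a rough composition filter only, not a signal peptide prediction method.

```python
# Hydrophobic residues as named in the paragraph above:
# Ala, Leu, Ile, Phe, Pro, Tyr, Trp, Val (one-letter codes).
HYDROPHOBIC = set("ALIFPYWV")


def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in HYDROPHOBIC."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(r in HYDROPHOBIC for r in peptide.upper()) / len(peptide)


def meets_hydrophobic_criterion(protein: str, length: int = 20,
                                min_fraction: float = 0.70) -> bool:
    """Check the N-terminal window of `length` residues against the threshold."""
    window = protein[:length]
    return len(window) == length and hydrophobic_fraction(window) >= min_fraction
```

A window length and threshold other than the defaults can be passed to explore the broader ranges recited above (e.g., 10 to 40 residues, 60-80% hydrophobic residues).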

[0199] In one embodiment, a HKNG1 protein contains a signal sequence at about amino acids 1 to 49 of SEQ ID NO:2. In another embodiment, a HKNG1 protein contains a signal sequence at about amino acids 30-49 of SEQ ID NO:2. In yet another embodiment, a HKNG1 protein contains a signal sequence at about amino acid residues 1 to 31 of SEQ ID NO:4. In yet another embodiment, a HKNG1 protein contains a signal sequence at about amino acids 12-31 of SEQ ID NO:4.

[0200] The signal sequence of a HKNG1, GNKH or TS protein is typically cleaved during processing of the mature protein. In particular, such signal peptides contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway. Thus, the invention pertains to the described HKNG1, GNKH or TS polypeptides having a signal sequence (i.e., “immature” polypeptides), as well as to the HKNG1, GNKH or TS signal sequences themselves and to the HKNG1, GNKH or TS polypeptides in the absence of a signal sequence (i.e., the “mature” HKNG1, GNKH or TS cleavage products). It is to be understood that HKNG1, GNKH or TS polypeptides of the invention can further comprise polypeptides comprising any signal sequence having the above-described characteristics and a mature HKNG1, GNKH or TS polypeptide sequence.

[0201] In one embodiment, a nucleic acid sequence encoding a signal sequence of the invention can be operably linked in an expression vector to a protein of interest, such as a protein which is ordinarily not secreted or is otherwise difficult to isolate. The signal sequence directs secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved. The protein can then be readily purified from the extracellular medium by art recognized methods. Alternatively, the signal sequence can be linked to the protein of interest using a sequence which facilitates purification, such as with a GST domain.

[0202] The HKNG1 protein sequences described above can also include one or more domains which comprise a clusterin domain, i.e., domains which are identical to or substantially homologous to (i.e., 65%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more identical to) the domain corresponding to amino acid residues 134 to 160 or amino acid residues 334 to 362 of SEQ ID NO:2, or to the domain corresponding to amino acid residues 105-131 or amino acid residues 305-333 of SEQ ID NO:39, or to the domain corresponding to amino acid residues 105-131 or amino acid residues 304-332 of SEQ ID NO:49. Preferably, such domains comprise cysteine amino acid residues at positions corresponding to conserved cysteine residues of the clusterin domains of SEQ ID NOs: 2, 39 or 49.
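
The percent-identity thresholds recited above can be illustrated with a minimal calculation. The sketch below assumes the two domain sequences have already been aligned without gaps and are of equal length, which is a simplification; in practice an alignment algorithm would be applied first, and the specification does not prescribe this particular calculation. The function names and the 65% default are chosen for the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned,
    equal-length, ungapped sequences (an assumption of this sketch)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_substantially_homologous(a: str, b: str, threshold: float = 65.0) -> bool:
    """True when identity meets the chosen threshold (65% is the lowest recited)."""
    return percent_identity(a, b) >= threshold
```

Two hypothetical five-residue segments differing at one position would score 80% identity and so satisfy the 65%, 75% and 80% thresholds but not the 85% threshold.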

[0203] In particular, HKNG1 protein sequences described above can also include one or more domains which comprise a conserved cysteine domain. Such a domain corresponds, for example, to the domain of cysteines corresponding to Cys134, Cys145, Cys148, Cys153 and Cys160; or to Cys334, Cys344, Cys351, Cys354, and Cys362 of SEQ ID NO:2 (FIGS. 1A-C). In an alternative embodiment, a conserved cysteine domain corresponds to one or more of the domains of SEQ ID NO:39 (FIG. 7A) which comprises Cys105, Cys116, Cys119, Cys124, and Cys131; or Cys314, Cys321, Cys324, and Cys332. In yet another alternative embodiment, a conserved cysteine domain corresponds to one or more of the domains of SEQ ID NO:49 (FIG. 13A) which comprises Cys105, Cys116, Cys119, Cys124, and Cys131; or Cys315, Cys322, Cys325 and Cys333.
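
As a hedged sketch of how the conserved cysteine positions listed above might be checked against a candidate sequence, the following Python function tests for cysteine at a set of 1-based positions. The position tuple reproduces the first cysteine group recited for SEQ ID NO:2; the function and constant names are hypothetical, and no actual sequence data from the specification is reproduced here.

```python
# First conserved cysteine group recited above for SEQ ID NO:2 (1-based positions).
SEQ2_CYS_GROUP_1 = (134, 145, 148, 153, 160)


def has_cysteines_at(sequence: str, positions) -> bool:
    """True when every 1-based position lies within the sequence and holds Cys."""
    return all(
        1 <= p <= len(sequence) and sequence[p - 1].upper() == "C"
        for p in positions
    )
```

A sequence shorter than the highest listed position, or one carrying any other residue at a listed position, fails the check.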

[0204] Finally, the HKNG1, GNKH and TS proteins of the invention also include HKNG1, GNKH and TS protein sequences wherein domains encoded by one or more exons of the cDNA sequence, or fragments thereof, have been deleted. For example, in one particularly preferred embodiment, the HKNG1 proteins of the invention are proteins in which the domain(s) corresponding to those domains encoded by exon 7 of SEQ ID NO:7, or fragments thereof, have been deleted. In another exemplary preferred embodiment, the HKNG1 proteins of the invention are proteins in which the domain(s) corresponding to those domains encoded by exon 10 of SEQ ID NO:7, or fragments thereof, have been deleted.

[0205] The HKNG1, GNKH and TS polypeptides of the invention can further comprise posttranslational modifications, including, but not limited to, glycosylations, acetylations, and myristoylations.

[0206] The HKNG1, GNKH and TS gene products, peptide fragments thereof and fusion proteins thereof, may be produced by recombinant DNA technology using techniques well known in the art. Thus, methods for preparing such gene products, polypeptides, peptides, fusion peptides and fusion polypeptides of the invention by expressing nucleic acid containing HKNG1, GNKH and/or TS gene sequences are described herein. Methods that are well known to those skilled in the art can be used to construct expression vectors containing HKNG1, GNKH and/or TS gene product coding sequences and appropriate transcriptional and translational control signals. These methods include, for example, in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. See, for example, the techniques described in Sambrook, et al., 1989, supra, and Ausubel, et al., 1989, supra. Alternatively, RNA capable of encoding HKNG1, GNKH and/or TS gene product sequences may be chemically synthesized using, for example, synthesizers. See, for example, the techniques described in “Oligonucleotide Synthesis”, 1984, Gait, ed., IRL Press, Oxford.

[0207] A variety of host-expression vector systems may be utilized to express the gene product coding sequences of the invention. Such host-expression systems represent vehicles by which the coding sequences of interest may be produced and subsequently purified, but also represent cells that may, when transformed or transfected with the appropriate nucleotide coding sequences, exhibit a gene product of the invention in situ. These include but are not limited to microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing HKNG1, GNKH and/or TS gene product coding sequences; yeast (e.g., Saccharomyces, Pichia) transformed with recombinant yeast expression vectors containing HKNG1, GNKH and/or TS gene product coding sequences; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing HKNG1, GNKH and/or TS gene product coding sequences; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing HKNG1, GNKH and/or TS gene product coding sequences; or mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs containing promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter).

[0208] In bacterial systems, a number of expression vectors may be advantageously selected depending upon the use intended for the gene product being expressed. For example, when a large quantity of such a protein is to be produced, e.g., for the generation of pharmaceutical compositions of HKNG1, GNKH or TS gene product or for raising antibodies to a HKNG1, GNKH or TS gene product, vectors that direct the expression of high levels of fusion protein products that are readily purified may be desirable. Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which the HKNG1, GNKH or TS gene product coding sequence may be ligated individually into the vector in frame with the lacZ coding region so that a fusion protein is produced; pIN vectors (Inouye and Inouye, 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke and Schuster, 1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene product can be released from the GST moiety.

[0209] In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The HKNG1, GNKH or TS gene product coding sequence may be cloned individually into non-essential regions (for example the polyhedrin gene) of the virus and placed under control of an AcNPV promoter (for example the polyhedrin promoter). Successful insertion of the gene product coding sequence will result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the inserted gene is expressed (e.g., see Smith, et al., 1983, J. Virol. 46:584; Smith, U.S. Pat. No. 4,215,051).

[0210] In mammalian host cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, the HKNG1, GNKH or TS gene product coding sequence of interest may be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., region E1 or E3) will result in a recombinant virus that is viable and capable of expressing the gene product in infected hosts (see, e.g., Logan and Shenk, 1984, Proc. Natl. Acad. Sci. USA 81:3655-3659). Specific initiation signals may also be required for efficient translation of inserted gene product coding sequences. These signals include the ATG initiation codon and adjacent sequences. In cases where an entire gene (e.g., an entire HKNG1, GNKH or TS gene), including its own initiation codon and adjacent sequences, is inserted into the appropriate expression vector, no additional translational control signals may be needed. However, in cases where only a portion of a gene coding sequence is inserted, exogenous translational control signals, including, perhaps, the ATG initiation codon, must be provided. Furthermore, the initiation codon must be in phase with the reading frame of the desired coding sequence to ensure translation of the entire insert. These exogenous translational control signals and initiation codons can be of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer elements, transcription terminators, etc. (see Bittner, et al., 1987, Methods in Enzymol. 153:516-544).
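
The requirement that an exogenous initiation codon be in phase with the reading frame of the insert can be illustrated with a short check. The Python sketch below takes a construct sequence and the 0-based index at which the first complete codon of the insert begins. It treats the first ATG in the construct as the initiation codon, which is a simplifying assumption, and it also confirms that no stop codon of the standard genetic code lies in frame between that ATG and the insert. The function name and the sequences in the usage note are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def atg_in_phase(construct: str, insert_start: int) -> bool:
    """Check that the first ATG is in frame with the insert.

    `insert_start` is the 0-based index of the first base of the insert's
    first complete codon. Using the first ATG found is a simplification.
    """
    seq = construct.upper()
    atg = seq.find("ATG")
    if atg == -1 or atg > insert_start:
        return False  # no initiation codon at or upstream of the insert
    if (insert_start - atg) % 3 != 0:
        return False  # initiation codon is out of phase with the insert
    # Translation would terminate early if a stop codon lies in frame.
    upstream_codons = (seq[i:i + 3] for i in range(atg, insert_start, 3))
    return not any(codon in STOP_CODONS for codon in upstream_codons)
```

For the hypothetical construct "GCCACCATGGCTGGAAAACCC", the first ATG begins at index 6, so an insert whose first codon begins at index 12 is in phase, whereas one beginning at index 13 is not.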

[0211] In addition, a host cell strain may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Different host cells have characteristic and specific mechanisms for the post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used. Such mammalian host cells include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, and WI38.

[0212] For long-term, high-yield production of recombinant proteins, stable expression is preferred. For example, cell lines that stably express a HKNG1, GNKH or TS gene product may be engineered. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with DNA controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker. Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective medium. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci that in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines that express a HKNG1, GNKH or TS gene product. Such engineered cell lines may be particularly useful in screening and evaluation of compounds that affect the endogenous activity of a HKNG1, GNKH or TS gene product.

[0213] A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler, et al., 1977, Cell 11:223), hypoxanthine-guanine phosphoribosyltransferase (Szybalska and Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026), and adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell 22:817) genes, which can be employed in tk⁻, hgprt⁻ or aprt⁻ cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for the following genes: dhfr, which confers resistance to methotrexate (Wigler, et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567; O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers resistance to mycophenolic acid (Mulligan and Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., 1981, J. Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre, et al., 1984, Gene 30:147).

[0214] Alternatively, the expression characteristics of an endogenous HKNG1, GNKH or TS gene within a cell line or microorganism may be modified by inserting a heterologous DNA regulatory element into the genome of a stable cell line or cloned microorganism such that the inserted regulatory element is operatively linked with the endogenous HKNG1, GNKH or TS gene. For example, an endogenous HKNG1, GNKH or TS gene which is normally “transcriptionally silent” (i.e., an HKNG1, GNKH or TS gene which is normally not expressed, or is expressed only at very low levels in a cell line or microorganism) may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell line or microorganism.

[0215] Alternatively, a transcriptionally silent, endogenous HKNG1, GNKH or TS gene may be activated by insertion of a promiscuous regulatory element that works across cell types.

[0216] A heterologous regulatory element may be inserted into a stable cell line or cloned microorganism, such that it is operatively linked with an endogenous gene, such as an endogenous HKNG1, GNKH or TS gene, using techniques, such as targeted homologous recombination, which are well known to those of skill in the art, and described, e.g., in Chappel, U.S. Pat. No. 5,272,071; PCT publication No. WO 91/06667, published May 16, 1991.

[0217] Alternatively, any fusion protein may be readily purified by utilizing an antibody specific for the fusion protein being expressed. For example, a system described by Janknecht, et al. allows for the ready purification of non-denatured fusion proteins expressed in human cell lines (Janknecht, et al., 1991, Proc. Natl. Acad. Sci. USA 88:8972-8976). In this system, the gene of interest is subcloned into a vaccinia recombination plasmid such that the gene's open reading frame is translationally fused to an amino-terminal tag consisting of six histidine residues. Extracts from cells infected with recombinant vaccinia virus are loaded onto Ni²⁺ nitrilotriacetic acid-agarose columns and histidine-tagged proteins are selectively eluted with imidazole-containing buffers.

[0218] The HKNG1, GNKH and/or TS gene products can also be expressed in transgenic animals. Animals of any species, including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, sheep, cows, and non-human primates, e.g., baboons, monkeys, and chimpanzees, may be used to generate HKNG1, GNKH and/or TS transgenic animals. The term “transgenic” as used herein, refers to animals expressing HKNG1, GNKH and/or TS gene sequences from a different species (e.g., mice expressing human HKNG1, GNKH and/or TS gene sequences); animals that have been genetically engineered to overexpress endogenous (i.e., same species) HKNG1, GNKH and/or TS sequences; and animals that have been genetically engineered to no longer express endogenous HKNG1, GNKH and/or TS gene sequences (i.e., “knock-out” animals), and their progeny.

[0219] Any technique known in the art may be used to introduce a HKNG1, GNKH or TS gene transgene into animals to produce the founder lines of transgenic animals. Such techniques include, but are not limited to, pronuclear microinjection (Hoppe and Wagner, 1989, U.S. Pat. No. 4,873,191); retrovirus mediated gene transfer into germ lines (Van der Putten, et al., 1985, Proc. Natl. Acad. Sci., USA 82:6148-6152); gene targeting in embryonic stem cells (Thompson, et al., 1989, Cell 56:313-321); electroporation of embryos (Lo, 1983, Mol. Cell. Biol. 3:1803-1814); and sperm-mediated gene transfer (Lavitrano et al., 1989, Cell 57:717-723). (For a review of such techniques, see Gordon, 1989, Transgenic Animals, Intl. Rev. Cytol. 115, 171-229.)

[0220] Any technique known in the art may be used to produce transgenic animal clones containing a HKNG1 transgene, for example, nuclear transfer into enucleated oocytes of nuclei from cultured embryonic, fetal or adult cells induced to quiescence (Campbell, et al., 1996, Nature 380:64-66; Wilmut, et al., 1997, Nature 385:810-813).

[0221] The present invention provides for transgenic animals that carry a HKNG1 transgene, GNKH transgene and/or a TS transgene in all their cells, as well as animals that carry the HKNG1, GNKH and/or TS transgenes in some, but not all their cells (i.e., mosaic animals). An HKNG1, GNKH or TS transgene may be integrated as a single transgene or in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lakso et al. (Lakso, et al., 1992, Proc. Natl. Acad. Sci. USA 89:6232-6236). The regulatory sequences required for such a cell-type specific activation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art. When it is desired that a HKNG1, GNKH or TS transgene be integrated into the chromosomal site of the endogenous HKNG1, GNKH or TS gene, gene targeting is preferred. Briefly, when such a technique is to be utilized, vectors containing some nucleotide sequences homologous to the endogenous gene are designed for the purpose of integrating, via homologous recombination with chromosomal sequences, into and disrupting the function of the nucleotide sequence of the endogenous gene. The transgene may also be selectively introduced into a particular cell type, thus inactivating the endogenous gene in only that cell type, by following, for example, the teaching of Gu, et al. (Gu, et al., 1994, Science 265:103-106). The regulatory sequences required for such a cell-type specific inactivation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art.

[0222] Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, U.S. Pat. No. 4,873,191 and in Hogan, Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986) and Wakayama et al., (1999), Proc. Natl. Acad. Sci. USA, 96:14984-14989. Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of mRNA encoding the transgene in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying the transgene can further be bred to other transgenic animals carrying other transgenes.

[0223] To create a homologous recombinant animal, a vector is prepared which contains at least a portion of a gene encoding a polypeptide of the invention into which a deletion, addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the gene. In one embodiment, the vector is designed such that, upon homologous recombination, the endogenous gene is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a “knock out” vector). Alternatively, the vector can be designed such that, upon homologous recombination, the endogenous gene is mutated or otherwise altered but still encodes functional protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of the endogenous protein). In the homologous recombination vector, the altered portion of the gene is flanked at its 5′ and 3′ ends by additional nucleic acid of the gene to allow for homologous recombination to occur between the exogenous gene carried by the vector and an endogenous gene in an embryonic stem cell. The additional flanking nucleic acid sequences are of sufficient length for successful homologous recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both at the 5′ and 3′ ends) are included in the vector (see, e.g., Thomas and Capecchi (1987) Cell 51:503 for a description of homologous recombination vectors). The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous gene are selected (see, e.g., Li et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see, e.g., Bradley in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley (1991) Current Opinion in Bio/Technology 2:823-829 and in PCT Publication NOs. WO 90/11354, WO 91/01140, WO 92/0968, and WO 93/04169.

[0224] In another embodiment, transgenic non-human animals can be produced which contain selected systems which allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) Proc. Natl. Acad. Sci. USA 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

[0225] Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT Publication Nos. WO 97/07668 and WO 97/07669.

[0226] Once transgenic animals have been generated, the expression of the recombinant gene may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to assay whether integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques that include but are not limited to Northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and RT-PCR (reverse transcriptase PCR). Samples of HKNG1, GNKH and/or TS gene-expressing tissue may also be evaluated immunocytochemically using antibodies specific for the HKNG1, GNKH or TS transgene product.

5.3. ANTIBODIES TO CHROMOSOME 18P GENE PRODUCTS

[0227] Described herein are methods for the production of antibodies capable of specifically recognizing one or more epitopes of the gene products of the present invention (i.e., HKNG1, GNKH and TS gene products) or epitopes of conserved variants or peptide fragments of these gene products. Further, antibodies that specifically recognize mutant forms of HKNG1, GNKH and TS gene products are encompassed by the invention. The terms “specifically bind” and “specifically recognize” refer to antibodies that bind to HKNG1, GNKH and TS gene product epitopes at a higher affinity than they bind to non-HKNG1, non-GNKH or non-TS (e.g., random) epitopes. Thus, for example, an antibody that specifically binds to, and thereby specifically recognizes, an HKNG1 gene product is one that binds to the HKNG1 gene product at a higher affinity than it binds to a non-HKNG1 gene product. Likewise, an antibody that specifically binds to, and thereby recognizes, a GNKH gene product is one that binds to the GNKH gene product at a higher affinity than it binds to a non-GNKH gene product. Likewise, an antibody that specifically binds to, and thereby recognizes, a TS gene product is one that binds to the TS gene product at a higher affinity than it binds to a non-TS gene product.

[0228] Such antibodies may include, but are not limited to, polyclonal antibodies, monoclonal antibodies (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab′)₂ fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above, including the polyclonal and monoclonal antibodies described in Section 12 below. Such antibodies may be used, for example, in the detection of a HKNG1, GNKH or TS gene product in a biological sample and may, therefore, be utilized as part of a diagnostic or prognostic technique whereby patients may be tested for abnormal levels of HKNG1, GNKH or TS gene products, and/or for the presence of abnormal forms of such gene products. Such antibodies may also be utilized in conjunction with, for example, compound screening schemes, as described below in Section 5.6, for the evaluation of the effect of test compounds on HKNG1, GNKH and TS gene product levels and/or activity. Additionally, such antibodies can be used in conjunction with the gene therapy techniques described below in Section 5.9.2 to, for example, evaluate the normal and/or engineered HKNG1, GNKH and/or TS-expressing cells prior to their introduction into the patient.

[0229] Anti-HKNG1, anti-GNKH or anti-TS gene product antibodies may additionally be used in methods for inhibiting abnormal HKNG1, GNKH and TS gene product activity. Such antibodies may, therefore, be utilized as part of treatment methods for a neuropsychiatric disorder mediated by HKNG1, GNKH and/or TS, such as BAD or schizophrenia.

[0230] For the production of antibodies against a HKNG1, GNKH and/or TS gene product, various host animals may be immunized by injection with a HKNG1, GNKH or TS gene product, or a portion thereof. Such host animals may include, but are not limited to, rabbits, mice, and rats, to name but a few. Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

[0231] Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of animals immunized with an antigen, such as a HKNG1, GNKH or TS gene product, or an antigenic functional derivative thereof. For the production of polyclonal antibodies, host animals such as those described above may be immunized by injection with HKNG1, GNKH or TS gene product supplemented with adjuvants as also described above.

[0232] Monoclonal antibodies, which are homogeneous populations of antibodies to a particular antigen, may be obtained by any technique that provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique of Kohler and Milstein (1975, Nature 256:495-497; and U.S. Pat. No. 4,376,110), the human B-cell hybridoma technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Such antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. The hybridoma producing the mAb of this invention may be cultivated in vitro or in vivo. Production of high titers of mAbs in vivo makes this the presently preferred method of production.

[0233] In addition, techniques developed for the production of “chimeric antibodies” (Morrison, et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger, et al., 1984, Nature 312:604-608; Takeda, et al., 1985, Nature, 314:452-454) by splicing the genes from a mouse antibody molecule of appropriate antigen specificity together with genes from a human antibody molecule of appropriate biological activity can be used. A chimeric antibody is a molecule in which different portions are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region. (See, e.g., Cabilly et al., U.S. Pat. No. 4,816,567; and Boss et al., U.S. Pat. No. 4,816,397, which are incorporated herein by reference in their entirety.)

[0234] In addition, techniques have been developed for the production of humanized antibodies. (See, e.g., Queen, U.S. Pat. No. 5,585,089, which is incorporated herein by reference in its entirety.) An immunoglobulin light or heavy chain variable region consists of a “framework” region interrupted by three hypervariable regions, referred to as complementarity determining regions (CDRs). The extent of the framework region and CDRs has been precisely defined (see “Sequences of Proteins of Immunological Interest”, Kabat, E. et al., U.S. Department of Health and Human Services (1983)). Briefly, humanized antibodies are antibody molecules from non-human species having one or more CDRs from the non-human species and a framework region from a human immunoglobulin molecule.

[0235] Alternatively, techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston, et al., 1988, Proc. Natl. Acad. Sci. USA 85:5879-5883; and Ward, et al., 1989, Nature 334:544-546) can be adapted to produce single chain antibodies against HKNG1, GNKH and TS gene products. Single chain antibodies are formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain polypeptide.

[0236] Antibody fragments that recognize specific epitopes may be generated by known techniques. For example, such fragments include but are not limited to: the F(ab′)₂ fragments, which can be produced by pepsin digestion of the antibody molecule, and the Fab fragments, which can be generated, e.g., by digesting the antibody molecule with papain or by reducing the disulfide bridges of F(ab′)₂ fragments. Alternatively, Fab expression libraries may be constructed (Huse, et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity.

5.4. USES OF HKNG1, GNKH AND TS GENE SEQUENCES, GENE PRODUCTS, AND ANTIBODIES

[0237] Described herein are various applications of the gene sequences, gene products (including peptide fragments and fusion proteins thereof) and antibodies of the present invention. In particular, among the applications described herein are applications which use the HKNG1 gene sequences and HKNG1 gene products (including HKNG1 peptide fragments and fusion proteins) described in Sections 5.1 and 5.2, above, as well as applications which use antibodies directed against such HKNG1 gene products, peptide fragments and fusion proteins, as described above in Section 5.3. The applications described herein also include applications which use the GNKH gene sequences and GNKH gene products (including GNKH peptide fragments and fusion proteins) described in Sections 5.1 and 5.2, above, as well as applications which use antibodies directed against such GNKH gene products, peptide fragments and fusion proteins, as described above in Section 5.3. The applications described herein also include applications which use the TS gene sequences and TS gene products (including TS peptide fragments and fusion proteins) described in Sections 5.1 and 5.2, above, as well as applications which use antibodies directed against such TS gene products, peptide fragments and fusion proteins, as described above in Section 5.3.

[0238] Such applications include, for example, mapping of human chromosome 18p, prognostic and diagnostic evaluation of disorders mediated by or associated with HKNG1, GNKH and/or TS (including CNS-related disorders, e.g., neuropsychiatric disorders such as BAD or schizophrenia), identification of individuals (e.g., human patients) with a predisposition to such disorders, and modulation of HKNG1, GNKH and/or TS-related processes. Such methods of diagnostic and prognostic evaluation are described, in detail, in Section 5.5, below.

[0239] Additionally, such applications include methods for the treatment of disorders mediated by HKNG1, GNKH and/or TS, including CNS-related disorders such as, e.g., BAD or schizophrenia. Such methods are described below, in detail, in Section 5.7. Further, screening methods, e.g., for identifying compounds that modulate the expression of a gene and/or the synthesis or activity of a gene product of the invention (e.g., a HKNG1, GNKH or TS gene or gene product), are described in Section 5.6, below. Compounds identified by such screening methods can be used, e.g., in the therapeutic methods described in Section 5.7 and include, e.g., other cellular products that are involved in processes such as mood regulation and in HKNG1, GNKH or TS-mediated disorders (e.g., neuropsychiatric disorders such as BAD or schizophrenia).

5.5. DIAGNOSIS OF DISORDERS ASSOCIATED WITH HKNG1, GNKH AND TS

[0240] A variety of methods can be employed for the diagnostic and prognostic evaluation of disorders associated with and/or mediated by one or more of the genes or gene products of the present invention (e.g., HKNG1-, GNKH- and TS-mediated disorders such as neuropsychiatric disorders, including BAD and schizophrenia) as well as for the identification of individual organisms (e.g., individual human patients) having a predisposition to such disorders. Such methods may, for example, utilize reagents such as the nucleotide sequences described in Section 5.1 (i.e., HKNG1, GNKH and TS nucleotide sequences), the gene products described in Section 5.2 (i.e., HKNG1, GNKH and TS gene products) and antibodies directed against such gene products, including antibodies directed against peptide fragments of such gene products described in Section 5.3 (i.e., antibodies directed against HKNG1, GNKH and TS peptide fragments). Specifically, such reagents may be used, e.g., for: (1) the detection of the presence of HKNG1 gene mutations, or the detection of either over- or under-expression of an HKNG1 gene relative to wild-type HKNG1 levels of expression; (2) the detection of over- or under-abundance of a HKNG1 gene product relative to wild-type abundance of HKNG1 gene product; and (3) the detection of an aberrant level of HKNG1 gene product activity relative to wild-type HKNG1 gene product activity levels.

[0241] Reagents such as those described above can also be used, e.g., for: (1) the detection of the presence of GNKH gene mutations, or the detection of either over- or under-expression of a GNKH gene relative to wild-type GNKH levels of expression; (2) the detection of over- or under-abundance of a GNKH gene product relative to wild-type abundance of GNKH gene product; and (3) the detection of an aberrant level of GNKH gene product activity relative to wild-type GNKH gene product activity levels.

[0242] Reagents such as those described above can also be used, e.g., for: (1) the detection of the presence of TS gene mutations, or the detection of either over- or under-expression of a TS gene relative to wild-type TS levels of expression; (2) the detection of over- or under-abundance of a TS gene product relative to wild-type abundance of TS gene product; and (3) the detection of an aberrant level of TS gene product activity relative to wild-type TS gene product activity levels.

[0243] Taking, for example, the HKNG1 gene nucleotide sequences of the present invention, such sequences can be used to diagnose a HKNG1-mediated neuropsychiatric disorder using, for example, the techniques for detecting HKNG1 mutations and polymorphisms described in Section 5.1.3, above, and in Section 5.5.1, below. Likewise, the GNKH gene nucleotide sequences of the invention, which are located in the same region of human chromosome 18p as the HKNG1 gene, can also be used to diagnose neuropsychiatric disorders using, e.g., the above-discussed techniques to detect GNKH mutations and polymorphisms. Likewise, the TS gene nucleotide sequences of the invention, which are located in the same region of human chromosome 18p as the HKNG1 gene, can also be used to diagnose neuropsychiatric disorders using, e.g., the above-discussed techniques to detect TS mutations and polymorphisms. Mutations at a number of different genetic loci of HKNG1, GNKH and/or TS may lead to phenotypes related to a particular disorder or condition such as a neuropsychiatric disorder (e.g., BAD or schizophrenia). Accordingly, the diagnostic and treatment methods of the invention are preferably designed to target the particular genetic loci containing the mutation or mutations mediating the disorders.

[0244] For example, genetic mutations and polymorphisms have been linked to differences in drug effectiveness. In one non-limiting embodiment of the present invention, therefore, alterations (i.e., polymorphisms) in the HKNG1 gene or gene product are associated with the efficacy of one or more particular drugs, including the tolerance or toxicity of the drugs to a patient. In such an embodiment, these mutations can be used in pharmacogenomic methods to optimize therapeutic drug treatments, including therapeutic drug treatments for one or more of the disorders described herein (e.g., CNS disorders, such as schizophrenia and BAD). In another exemplary and non-limiting embodiment of the invention, alterations (i.e., polymorphisms) in the GNKH gene or gene product are associated with the efficacy of one or more particular drugs, including the tolerance or toxicity of the drug to a patient. In another exemplary and non-limiting embodiment of the invention, alterations (i.e., polymorphisms) in the TS gene or gene product are associated with the efficacy of one or more particular drugs, including the tolerance or toxicity of the drug to a patient. These mutations can also be used in pharmacogenomic methods to optimize therapeutic drug treatments (e.g., for one or more of the disorders described herein, including CNS disorders such as schizophrenia and BAD).

[0245] Such polymorphisms in the HKNG1, GNKH and/or TS genes can be used, for example, to refine the design of drugs by decreasing the incidence of adverse events in drug tolerance studies, e.g., by identifying patient subpopulations of individuals who respond or do not respond to a particular drug therapy in efficacy studies, wherein the subpopulations have a HKNG1, GNKH or TS polymorphism associated with drug responsiveness or unresponsiveness. The pharmacogenomic methods of the present invention can also provide tools to identify new drug targets for designing drugs and to optimize the use of already existing drugs, e.g., to increase the response rate to a drug and/or to identify and exclude non-responders from certain drug treatments (e.g., individuals having a particular HKNG1, GNKH or TS polymorphism associated with unresponsiveness or inferior responsiveness to the drug treatment), to decrease the undesirable side effects of certain drug treatments and/or to identify and exclude individuals with marked susceptibility to such side effects (e.g., individuals having a particular HKNG1, GNKH or TS polymorphism associated with an undesirable side effect of a drug treatment).

[0246] In other embodiments of the present invention, polymorphisms in an HKNG1 gene sequence or flanking sequences, or variations in HKNG1 gene expression (including levels of an HKNG1 protein or an HKNG1 messenger RNA) or activity (e.g., variations due to altered methylation, differential splicing, or post-translational modification such as proteolytic cleavage or glycosylation) may be utilized to identify an individual having a disease or condition resulting from a disorder associated with or mediated by HKNG1. Likewise, in other embodiments of the invention, polymorphisms in a GNKH gene sequence or flanking sequences, or variations in GNKH gene expression (including levels of a GNKH protein or a GNKH messenger RNA) or activity (e.g., variations due to altered methylation, differential splicing, or post-translational modification such as proteolytic cleavage or glycosylation) may be utilized to identify an individual having a disease or condition resulting from a disorder associated with or mediated by GNKH. Likewise, in other embodiments of the invention, polymorphisms in a TS gene sequence or flanking sequences, or variations in TS gene expression (including levels of a TS protein or a TS messenger RNA) or activity (e.g., variations due to altered methylation, differential splicing, or post-translational modification such as proteolytic cleavage or glycosylation) may be utilized to identify an individual having a disease or condition resulting from a disorder associated with or mediated by TS. Once a polymorphism in an HKNG1, GNKH or TS gene, or in a flanking sequence in linkage disequilibrium with a disorder-causing allele of a HKNG1, GNKH or TS gene, or a variation in HKNG1, GNKH or TS gene expression or activity has been identified in an individual, an appropriate treatment (e.g., an appropriate drug therapy) can be prescribed to the individual.

[0247] Nucleic acid-based detection techniques which may be used to detect such genetic variations (e.g., mutations and/or polymorphisms) in a HKNG1, GNKH and/or TS gene are described, below, in Section 5.5.1. Peptide detection techniques are described, below, in Section 5.5.2. As will be apparent to one of skill in the art, for the detection of HKNG1 gene mutations or polymorphisms, any nucleated cell can be used as a starting source for genomic nucleic acid. For the detection of HKNG1 gene expression or HKNG1 gene products, any cell type or tissue in which the HKNG1 gene is expressed may be utilized. Likewise, for the detection of GNKH gene expression or GNKH gene products, any cell type or tissue in which the GNKH gene is expressed may be utilized. Likewise, for the detection of TS gene expression or TS gene products, any cell type or tissue in which the TS gene is expressed may be utilized.

[0248] In preferred embodiments, such diagnostic and prognostic methods are performed utilizing prepackaged diagnostic kits. Accordingly, kits for detecting the presence of a polypeptide or nucleic acid of the invention (e.g., a HKNG1 polypeptide or nucleic acid, a GNKH polypeptide or nucleic acid, or a TS polypeptide or nucleic acid) in a biological sample (e.g., in a test sample) are also provided in the present invention. Such kits can be used, e.g., to determine if a subject is suffering from or is at increased risk of developing a disorder associated with a disorder-causing allele of a gene of the invention (e.g., of a HKNG1, GNKH or TS gene) or aberrant expression or activity of a polypeptide of the invention. For example, the kits of the invention can be used to identify individuals who suffer from or are at increased risk of developing a CNS disorder, including a neuropsychiatric disorder such as BAD or schizophrenia, that is associated with a disorder-causing allele or aberrant expression or activity of a gene or gene product (e.g., a HKNG1, GNKH or TS gene or gene product) of the invention.

[0249] As an example, and not by way of limitation, such a kit can comprise a labeled compound or agent capable of detecting a HKNG1, GNKH or TS polypeptide, or HKNG1, GNKH or TS gene sequences (e.g., DNA or mRNA molecules comprising HKNG1, GNKH or TS nucleotide sequences) in a biological sample. The kit can further comprise a means for determining the amount of the polypeptide, mRNA or DNA in the sample, such as an antibody which specifically binds to the polypeptide or an oligonucleotide probe which is complementary to, and therefore capable of hybridizing to, DNA and/or mRNA molecules that encode the polypeptide. A kit of the invention can also include instructions for observing that the tested subject is suffering from or is at risk of developing a disorder associated, e.g., with aberrant expression of the polypeptide if the amount of the polypeptide or of mRNA encoding the polypeptide is above or below a normal value or, more generally, above or below a normal range of values. Alternatively, the kit can include instructions for observing that the tested subject is suffering from or is at risk of developing a disorder if the mRNA or DNA detected in the sample correlates with a HKNG1, GNKH or TS allele that causes or is associated with a disorder.
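
The comparison that such kit instructions describe, i.e., checking a measured amount against a normal range of values, can be sketched as follows. This is a minimal illustration only: the function name, the units, and the numeric range in the usage note are hypothetical and are not taken from this specification, which does not state any normal values.

```python
def classify_level(amount: float, normal_low: float, normal_high: float) -> str:
    """Compare a measured amount (polypeptide or mRNA) with a normal range.

    All three arguments must share the same (hypothetical) unit. The normal
    range itself is an assumption supplied by the caller.
    """
    if normal_low > normal_high:
        raise ValueError("normal_low must not exceed normal_high")
    if amount < normal_low:
        return "below normal range"
    if amount > normal_high:
        return "above normal range"
    return "within normal range"
```

For instance, with a purely hypothetical normal range of 20.0 to 80.0 units, `classify_level(12.0, 20.0, 80.0)` reports an amount below the range; a result outside the range would be read according to the kit instructions described above.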

[0250] In more detail, for antibody-based kits, a kit can comprise, for example: (1) a first antibody (e.g., attached to a solid surface or support) which binds to a polypeptide of the invention (e.g., to a HKNG1, GNKH or TS polypeptide); and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent. For oligonucleotide kits, a kit can comprise, for example: (1) an oligonucleotide (e.g., a detectably labeled oligonucleotide) which hybridizes to a nucleic acid sequence encoding a polypeptide of the invention (e.g., to a nucleic acid sequence encoding a HKNG1, GNKH, or TS polypeptide); or (2) a pair of primers, such as the primers recited in Table 1, below, that can be used to amplify (e.g., by PCR) a nucleic acid molecule encoding a polypeptide of the invention.

[0251] The kits of the invention can further comprise, for example, one or more buffering agents, preservatives or protein stabilizing agents. The kits can also comprise additional components necessary and/or useful for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can still further contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit is usually enclosed within an individual container, and all of the various containers are typically within a single package along with instructions for observing whether a tested subject is suffering from or is at risk of developing a disorder associated, e.g., with polymorphisms that correlate with alleles that cause a HKNG1-, GNKH- and/or TS-related disorder, with aberrant levels of HKNG1, GNKH or TS mRNA, with aberrant levels of HKNG1, GNKH or TS polypeptides, or with aberrant HKNG1, GNKH or TS activity.

5.5.1. DETECTION OF NUCLEIC ACID MOLECULES

[0252] Portions or fragments of the cDNA and genomic sequences described herein have many useful applications as polynucleotide reagents. For example, these sequences can be used to: (i) screen for HKNG1, GNKH and/or TS gene-specific mutations or polymorphisms; (ii) map their respective genes (including HKNG1, GNKH and/or TS homologs and orthologs expressed in other species) on a chromosome and, thus, locate gene regions associated with genetic disease, including regions associated with neuropsychiatric disorders such as BAD; (iii) identify individuals from a minute biological sample (tissue typing); and (iv) aid in forensic identification of a biological sample. These applications are described, in detail, in the subsections below.

[0253] Detection of Mutations and Polymorphisms:

[0254] A variety of methods can be employed to screen for the presence of mutations or polymorphisms that are specific to the HKNG1, GNKH and TS genes of the invention, including polymorphisms flanking the HKNG1, GNKH or TS gene, and to detect and/or assay levels of HKNG1, GNKH or TS nucleic acid sequences in a sample.

[0255] Mutations or polymorphisms within or flanking a HKNG1, GNKH or TS gene can be detected by utilizing a number of techniques that are known in the art. Nucleic acid from any nucleated cell can be isolated according to standard nucleic acid preparation procedures that are well known to those of skill in the art and used as the starting point for such assay techniques.

[0256] As an example, HKNG1, GNKH and TS nucleic acid sequences can be used in hybridization or amplification assays of biological samples to detect abnormalities involving HKNG1, GNKH or TS gene structure, including, for example, point mutations, insertions, deletions, inversions, translocations and chromosomal rearrangements. Exemplary assays include, but are not limited to, Southern analyses, single stranded conformational polymorphism analyses (SSCP) and PCR analyses.

[0257] Diagnostic methods for the detection of gene-specific mutations or polymorphisms (e.g., mutations or polymorphisms that are specific to the HKNG1 gene, the GNKH gene, or the TS gene) can involve, for example, contacting and incubating nucleic acids obtained from a sample (e.g., derived from a patient sample or from another appropriate cellular source) with one or more labeled nucleic acid reagents (including, for example, recombinant DNA molecules, cloned genes or degenerate variants thereof as described in Section 5.1, above) under conditions favorable for the specific annealing of these reagents to their complementary sequences within or flanking the HKNG1, GNKH or TS gene. The diagnostic methods of the present invention further encompass contacting and incubating nucleic acids for the detection of single nucleotide mutations or polymorphisms of the HKNG1, GNKH or TS gene. Preferably, the nucleic acid reagent sequences are sequences within the HKNG1, GNKH or TS gene, or, alternatively, are chromosome 18p nucleotide sequences (e.g., human chromosome 18p nucleotide sequences) flanking the HKNG1, GNKH or TS gene. Preferably, the nucleic acid reagent sequences are 15 to 30 nucleotides in length.

[0258] After incubation, all non-hybridized nucleic acids are removed and the presence of nucleic acids that have hybridized, if any such molecules exist, is then detected. Using such a detection scheme, the nucleic acid from the cell type or tissue of interest can be immobilized, e.g., to a solid support such as a membrane, a plastic surface (e.g., on a microtiter plate or polystyrene beads) or a glass surface such as on a glass slide or plate. In such embodiments, non-hybridized, labeled nucleic acid reagents of the type described in Section 5.1, above, are easily removed after incubation. Detection of the remaining, hybridized nucleic acid reagents is then accomplished using standard techniques well-known in the art. The HKNG1, GNKH or TS gene sequences to which the nucleic acid reagents have annealed can then be compared, e.g., to the annealing pattern expected from a normal HKNG1, GNKH or TS gene sequence in order to determine whether a HKNG1, GNKH or TS gene mutation is present. In a particularly preferred embodiment, mutations or polymorphisms specific to a HKNG1, GNKH or TS gene (including mutations or polymorphisms flanking a HKNG1, GNKH or TS gene) can be detected using a microassay of HKNG1, GNKH or TS nucleic acid sequences immobilized to a substrate or “gene chip” (see, e.g., Cronin et al., 1996, Human Mutation 7:244-255).

[0259] Alternative diagnostic methods for the detection of HKNG1, GNKH or TS gene-specific nucleic acid molecules (or of sequences flanking a HKNG1, GNKH or TS gene) in patient samples or in other appropriate cell sources may involve their amplification, e.g., by PCR (see, e.g., the experimental embodiment set forth in Mullis, 1987, U.S. Pat. No. 4,683,202), followed by the analysis of the amplified molecules using techniques well known to those of skill in the art including, for example, those techniques described hereinabove. The resulting amplified sequences can be compared to those that would be expected, e.g., if the nucleic acid being amplified contained only normal copies of a HKNG1, GNKH or TS gene, in order to determine whether a mutation or polymorphism of the HKNG1, GNKH or TS gene is present in the sample.
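
As an illustration of comparing an amplified sequence with the sequence expected from a normal copy of a gene, the following minimal sketch reports single-base substitutions between two sequences of equal length. The function name and the short example sequences in the usage note are hypothetical and are not HKNG1, GNKH or TS sequences; insertions and deletions would require a proper alignment step, which is not shown.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length,
    so only point substitutions are reported.
    """
    ref = reference.upper()
    obs = sample.upper()
    if len(ref) != len(obs):
        raise ValueError("sequences must have equal length; align them first")
    return [
        (position, ref_base, obs_base)
        for position, (ref_base, obs_base) in enumerate(zip(ref, obs), start=1)
        if ref_base != obs_base
    ]
```

With the hypothetical fragments `ATGGAGCTT` (expected) and `ATGAAGCTT` (observed), the function reports one substitution, G to A at position 4, which changes the codon GAG (glutamic acid) to AAG (lysine).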

[0260] Among those nucleic acid sequences which are preferred for such amplification-related diagnostic screening analyses are oligonucleotide primers which amplify HKNG1, GNKH or TS exon sequences. The sequences of such oligonucleotide primers are preferably derived from intron sequences so that the entire exon (i.e., the entire coding region of a HKNG1, GNKH or TS gene) can be analyzed as discussed below. Preferably, primer pairs used for amplification of exons are derived from adjacent introns. For example, in those embodiments wherein one or more exons of the HKNG1 gene of the invention are to be amplified, appropriate primer pairs can be chosen such that each of the thirteen HKNG1 exons in SEQ ID NO:7, including the exons referred to as Exon 2′ and Exon 2″, respectively, is amplified. In particular, primers for the amplification of HKNG1 exons can be routinely designed by one of ordinary skill in the art using the exon and intron sequences of HKNG1 shown, e.g., in FIG. 3A 3A-28 (SEQ ID NO:7). Likewise, appropriate primer pairs can also be chosen for amplifying each of the GNKH exons. Indeed, such primers can also be routinely designed by one of ordinary skill in the art by utilizing the exon and intron sequences of GNKH shown, e.g., in FIGS. 30A-B (SEQ ID NO: 124). Likewise, appropriate primer pairs can also be chosen for amplifying each of the TS exons. Indeed, such primers can also be routinely designed by one of ordinary skill in the art by utilizing the exon and intron sequences of TS shown, e.g., in FIGS. 44A-G (SEQ ID NO:140).

[0261] As an example, and not by way of limitation, Table 1, below, lists primers and primer pairs which can be utilized for the amplification of each of the human HKNG1 exons one through eleven. In this table, a primer pair is listed for each exon which consists of a forward primer derived from intron sequence upstream of the exon to be amplified, and a reverse primer derived from intron sequence downstream of the exon to be amplified. For exons greater than about 300 base pairs in length, i.e., exons 4 and 7, two primer pairs are listed (marked 4a, 4b, 7a and 7b). Each of the primer pairs can be utilized, therefore, as part of a standard PCR reaction to amplify an individual HKNG1 exon (or portion thereof). Primer sequences are depicted in a 5′ to 3′ orientation.

TABLE 1

Exon   Primer Sequence (5′ to 3′)                    Orientation
1      cggggttggtttccacc (SEQ ID NO:8)               forward
       gcgaggagagaaatctggg (SEQ ID NO:9)             reverse
2      tgctcactactttgcagtgttc (SEQ ID NO:10)         forward
       tgagatcgtgtcactgcattct (SEQ ID NO:11)         reverse
2′     gtcatgcttttatacattc (SEQ ID NO:14)            forward
       ggacaaccaacatgcaaacag (SEQ ID NO:15)          reverse
4b     cccaggtgttttcaattgatgc (SEQ ID NO:16)         forward
       agcagttttgtccttccaagtg (SEQ ID NO:17)         reverse
5      gtgttttgtaatctgatcagatctc (SEQ ID NO:18)      forward
       gcagtatttctggtccagatc (SEQ ID NO:19)          reverse
6      ggtgcacatagatcatgaaatgg (SEQ ID NO:20)        forward
       taagctgaaataggtgccttaag (SEQ ID NO:21)        reverse
7      atttattccatttctgtcccctac (SEQ ID NO:22)       forward
       aaggctcagttaggtctgtatc (SEQ ID NO:23)         reverse
7b     caggagttttaacgtcttcagac (SEQ ID NO:24)        forward
       gactcagaaatgtctaccatttc (SEQ ID NO:25)        reverse
8      tgtctccacttcttcaaagtgc (SEQ ID NO:26)         forward
       caaaatgtacctgagaacttaaag (SEQ ID NO:27)       reverse
9      cacctccaagtttcatggac (SEQ ID NO:28)           forward
       caaggtatgcacgtgtcatttc (SEQ ID NO:29)         reverse
10     gaatgtgtattgggatttagtaaac (SEQ ID NO:30)      forward
       ttgagaattaactattcctgtcaac (SEQ ID NO:31)      reverse
10′    gaattagacgaggcgatcag                          forward
       acttactggatataggatgc                          reverse
11     ccatcctggacttttactcc (SEQ ID NO:32)           forward
       ctttcctgcaactgtgtttattg (SEQ ID NO:33)        reverse

[0262] Each primer pair in Table 1, above, can be used to generate an amplified sequence of about 300 base pairs. This is especially desirable in instances in which sequence analysis is performed using SSCP gel electrophoretic procedures, in that such procedures work optimally using sequences of about 300 base pairs or less. These primer sets are also used extensively for direct sequencing of the PCR product to detect mutations.

[0263] Additional nucleic acid sequences which are preferred for such amplification-related analyses are those which will detect the presence of an HKNG1 polymorphism which differs from the HKNG1 sequence depicted in FIG. 3A-3A-28 (SEQ ID NO:7), those nucleic acid sequences which will detect the presence of a GNKH polymorphism which differs from the GNKH sequence depicted in FIGS. 30A-30B (SEQ ID NO:124), and those nucleic acid sequences which will detect the presence of a TS polymorphism which differs from the TS sequence depicted in FIG. 44A-G (SEQ ID NO:140). Such polymorphisms include ones which represent mutations associated with a neuropsychiatric disorder, such as BAD or schizophrenia, that is associated with or mediated by HKNG1, GNKH or TS. For example, a single base mutation identified in the Example presented in Section 8, below, results in a mutant HKNG1 gene product comprising substitution of a lysine residue for the wild-type glutamic acid residue at amino acid position 202 of the HKNG1 amino acid sequence shown in FIG. 1A-1C (SEQ ID NO:2) or amino acid position 184 of the HKNG1 amino acid sequence shown in FIG. 2A-2C (SEQ ID NO:4). Such polymorphisms also include ones that correlate with the presence of a neuropsychiatric disorder associated with and/or mediated by HKNG1, GNKH or TS, e.g., polymorphisms that are in linkage disequilibrium with disorder-causing alleles of the HKNG1, GNKH or TS genes.

[0264] Amplification techniques are well known to those of skill in the art and can routinely be utilized in connection with primers such as those listed in Table 1 above. In general, hybridization conditions can be as follows: for probes between 14 and 70 nucleotides in length, the melting temperature Tm is calculated using the formula: Tm(° C.) = 81.5 + 16.6(log[monovalent cations]) + 0.41(% G+C) − (500/N), where N is the length of the probe. If the hybridization is carried out in a solution containing formamide, the melting temperature is calculated using the equation Tm(° C.) = 81.5 + 16.6(log[monovalent cations]) + 0.41(% G+C) − 0.61(% formamide) − (500/N), where N is the length of the probe.
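
By way of illustration only, the melting temperature formulas above can be evaluated with a short Python sketch. The function name and parameters are hypothetical and not part of the disclosure; the sketch assumes that the logarithm is base 10, that the monovalent cation concentration is expressed in mol/L, and that % G+C and % formamide are expressed as percentages from 0 to 100.

```python
import math

def melting_temperature(probe, monovalent_molar, formamide_percent=0.0):
    """Estimate Tm in degrees C for a probe of 14 to 70 nucleotides.

    Tm = 81.5 + 16.6*log10([monovalent cations]) + 0.41*(% G+C)
         - 0.61*(% formamide) - 500/N
    Assumptions (not stated in the text): log is base 10, concentration in mol/L.
    """
    n = len(probe)
    if not 14 <= n <= 70:
        raise ValueError("formula is stated for probes of 14 to 70 nucleotides")
    # Percentage of G and C bases in the probe (case-insensitive).
    gc_percent = 100.0 * sum(base in "GCgc" for base in probe) / n
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / n)

# Exon 11 forward primer from Table 1 (SEQ ID NO:32): 20 nucleotides, 50% G+C.
primer = "ccatcctggacttttactcc"
tm_aqueous = melting_temperature(primer, monovalent_molar=1.0)
tm_formamide = melting_temperature(primer, monovalent_molar=1.0, formamide_percent=50.0)
```

With formamide_percent left at zero, the function reduces to the first formula; the salt and formamide values in the usage lines are arbitrary example inputs, not recommended conditions.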

[0265] Additionally, well-known genotyping techniques can be performed to identify individuals carrying HKNG1, GNKH or TS gene mutations. Such techniques include, for example, the use of restriction fragment length polymorphisms (RFLPs), which involve sequence variations in one of the recognition sites for the specific restriction enzyme used.
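
As a minimal sketch of the principle behind RFLP analysis, the following Python fragment computes the fragment lengths expected when a linear sequence is cut at every occurrence of a recognition site. The function name and the two toy alleles are hypothetical and are not taken from any HKNG1, GNKH or TS sequence; the recognition site GAATTC with a cut after the first base corresponds to the enzyme EcoRI.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the strand supplied.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Toy alleles (hypothetical): the variant carries a single base change
# (A to G) inside the recognition site, which abolishes cutting.
reference_allele = "AAAAAAAAAAGAATTCTTTTTTTTTT"
variant_allele = "AAAAAAAAAAGAGTTCTTTTTTTTTT"

reference_fragments = digest(reference_allele, "GAATTC", cut_offset=1)
variant_fragments = digest(variant_allele, "GAATTC", cut_offset=1)
```

The differing fragment patterns for the two alleles are what would be observed as a length polymorphism after electrophoresis.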

[0266] Further, improved methods for analyzing DNA polymorphisms, which can be utilized for the identification of HKNG1, GNKH or TS gene-specific mutations, have been described that capitalize on the presence of variable numbers of short, tandemly repeated DNA sequences between the restriction enzyme sites. For example, Weber (U.S. Pat. No. 5,075,217) describes a DNA marker based on length polymorphisms in blocks of (dC-dA)n-(dG-dT)n short tandem repeats. The average separation of (dC-dA)n-(dG-dT)n blocks is estimated to be 30,000-60,000 bp. Markers that are so closely spaced exhibit a high frequency of co-inheritance, and are extremely useful in the identification of genetic mutations, such as, for example, mutations within the HKNG1, GNKH or TS gene, and the diagnosis of diseases and disorders related to HKNG1, GNKH or TS mutations.
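
The length polymorphism underlying such markers can be illustrated with a short Python sketch that counts the repeat units in the longest uninterrupted (CA)n run of a sequence. The function name and the example alleles are hypothetical; the sketch scans only the strand supplied, so a block read from the complementary strand would appear as a (GT)n run instead.

```python
import re

def longest_ca_repeat(sequence, min_units=2):
    """Return the number of CA units in the longest uninterrupted (CA)n run."""
    pattern = r"(?:CA){%d,}" % min_units
    runs = re.findall(pattern, sequence.upper())
    return max((len(run) // 2 for run in runs), default=0)

# Two toy alleles (hypothetical) that differ only in repeat number.
allele_a = "GGT" + "CA" * 12 + "TTG"
allele_b = "GGT" + "CA" * 15 + "TTG"

units_a = longest_ca_repeat(allele_a)
units_b = longest_ca_repeat(allele_b)
```

Alleles that differ in the number of repeat units yield amplified products of different lengths when flanking primers are used, which is the property exploited for genotyping.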

[0267] Caskey et al. (U.S. Pat. No. 5,364,759) describe a DNA profiling assay for detecting short tri- and tetranucleotide repeat sequences. The process includes extracting the DNA of interest, such as the HKNG1 gene or a fragment thereof, the GNKH gene or a fragment, or the TS gene or a fragment, amplifying the extracted DNA, and labeling the repeat sequences to form a genotypic map of the individual's DNA.

[0268] Other methods well known in the art may be used to identify single nucleotide polymorphisms (SNPs), including biallelic SNPs or biallelic markers which have two alleles, both of which are present at a fairly high frequency in a population. Conventional techniques for detecting SNPs include, e.g., conventional dot blot analysis, single stranded conformational polymorphism (SSCP) analysis (see, e.g., Orita et al., 1989, Proc. Natl. Acad. Sci. USA 86:2766-2770), denaturing gradient gel electrophoresis (DGGE), heteroduplex analysis, mismatch cleavage detection, and other routine techniques well known in the art (see, e.g., Sheffield et al., 1989, Proc. Natl. Acad. Sci. 86:5855-5892; Grompe, 1993, Nature Genetics 5:111-117). Alternative, preferred methods of detecting and mapping SNPs involve microsequencing techniques wherein a SNP site in a target DNA is detected by a single nucleotide primer extension reaction (see, e.g., Goelet et al., PCT Publication No. WO92/15712; Mundy, U.S. Pat. No. 4,656,127; Vary and Diamond, U.S. Pat. No. 4,851,331; Cohen et al., PCT Publication No. WO91/02087; Chee et al., PCT Publication No. WO95/11995; Landegren et al., 1988, Science 241:1077-1080; Nickerson et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927; Pastinen et al., 1997, Genome Res. 7:606-614; Pastinen et al., 1996, Clin. Chem. 42:1391-1397; Jalanko et al., 1992, Clin. Chem. 38:39-43; Shumaker et al., 1996, Hum. Mutation 7:346-354; Caskey et al., PCT Publication No. WO 95/00669).

[0269] Levels of HKNG1, GNKH and/or TS gene expression can also be assayed. For example, RNA from a cell type or tissue known, or suspected, to express the HKNG1, the GNKH or the TS gene, such as brain, may be isolated and tested utilizing hybridization or PCR techniques such as are described above and in the Example presented in Section 19, below. The isolated cells can be derived, e.g., from cell culture or from a patient. For example, the analysis of cells taken from culture may be a necessary step in the assessment of cells to be used as part of a cell-based gene therapy technique or, alternatively, to test the effect of compounds on the expression of the HKNG1, GNKH or TS gene. Such analyses may reveal both quantitative and qualitative aspects of the expression pattern of a gene (e.g., the HKNG1, GNKH or TS gene), including activation or inactivation of gene expression.

[0270] In one embodiment of such a detection scheme, a cDNA molecule is synthesized from an RNA molecule of interest (e.g., by reverse transcription of the RNA molecule into cDNA). A sequence within the cDNA is then used as the template for a nucleic acid amplification reaction, such as a PCR amplification reaction, or the like. The nucleic acid reagents used as synthesis initiation reagents (e.g., primers) in the reverse transcription and nucleic acid amplification steps of this method are chosen from among the HKNG1, GNKH and TS gene nucleic acid reagents described in Section 5.1. Preferred lengths of such nucleic acid reagents are at least 9-30 nucleotides. For detection of the amplified product, the nucleic acid amplification may be performed using radioactively or non-radioactively labeled nucleotides. Alternatively, enough amplified product may be made such that the product may be visualized by standard ethidium bromide staining or by utilizing any other suitable nucleic acid staining method.

[0271] Additionally, it is possible to perform such gene expression assays “in situ”, i.e., directly upon tissue sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections, such that no nucleic acid purification is necessary. Nucleic acid reagents such as those described in Section 5.1 may be used as probes and/or primers for such in situ procedures (see, for example, Nuovo, G. J., 1992, “PCR In Situ Hybridization: Protocols And Applications”, Raven Press, NY).

[0272] Alternatively, if a sufficient quantity of the appropriate cells can be obtained, standard Northern analysis can be performed to determine the level of mRNA expression of the HKNG1, the GNKH or the TS gene.

[0273] Chromosome Mapping:

[0274] Once the sequence (or a portion of the sequence) of a gene has been isolated, the isolated sequence can be used to map the location of the genes on a chromosome. Genes which can be mapped using the isolated sequence include not only the gene corresponding to the isolated sequence itself, but also homologs and orthologs of that gene. Accordingly, the nucleic acid molecules described herein and fragments thereof can be used to map the location of corresponding genes, including homologs and orthologs of those genes, on a chromosome. The mapping of the sequence to chromosomes is an important first step in correlating these sequences with genes associated with disease.

[0275] Briefly, genes can be mapped to chromosomes using techniques well known to those skilled in the art, including, e.g., preparation of PCR primers (preferably 15-25 bp in length) from the sequence of a gene of the invention. Computer analysis of the sequence of a gene of the invention can be used to rapidly select primers that do not span more than one exon in the genomic DNA, since primers that span more than one exon would complicate the amplification process. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the gene sequences will yield an amplified fragment. For a review of this technique, see D'Eustachio et al. (1983, Science 220:919-924).
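
The computer analysis mentioned above is not specified further. As one possible sketch, assuming exon boundaries are available as half-open coordinate intervals on a cDNA sequence, a candidate primer can be accepted only when it lies wholly within a single exon. The function name, the coordinates and the exon boundaries below are hypothetical.

```python
def within_single_exon(primer_start, primer_end, exon_bounds):
    """Return True if [primer_start, primer_end) lies inside one exon.

    exon_bounds is a list of (start, end) half-open intervals in cDNA
    coordinates; these are assumed inputs, not values from the disclosure.
    """
    return any(start <= primer_start and primer_end <= end
               for start, end in exon_bounds)

# Hypothetical cDNA with two exons joined at position 120.
exons = [(0, 120), (120, 300)]

ok_primer = within_single_exon(100, 120, exons)        # ends at the junction
spanning_primer = within_single_exon(110, 130, exons)  # crosses the junction
```

A primer that crosses an exon junction in the cDNA would be interrupted by an intron in genomic DNA, which is why such candidates are rejected here.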

[0276] PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular sequence to a particular chromosome. Three or more sequences can be assigned per day using a single thermal cycler. Using the nucleic acid sequences of the invention to design oligonucleotide primers, sublocalization can be achieved with panels of fragments from specific chromosomes. Other mapping strategies which can similarly be used to map a gene to its chromosome include in situ hybridization (described in Fan et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:6223-6227), pre-screening with labeled flow-sorted chromosomes (CITE), and pre-selection by hybridization. Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step (for a review, see Verma et al., 1988, Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York).

[0277] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[0278] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data, which can be found, e.g., in V. McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins University Welch Medical Library). The relationship between genes and disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described, e.g., in Egeland et al., 1987, Nature 325:783-787.

[0279] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a gene of the invention can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations, that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[0280] Furthermore, the nucleic acid sequences disclosed herein can be used to perform searches against “mapping databases”, e.g., BLAST-type search, such that the chromosome position of the gene is identified by sequence homology or identity with known sequence fragments which have been mapped to chromosomes.

[0281] A polypeptide and fragments and sequences thereof and antibodies specific thereto can be used to map the location of the gene encoding the polypeptide on a chromosome. This mapping can be carried out by specifically detecting the presence of the polypeptide in members of a panel of somatic cell hybrids between cells of a first species of animal from which the protein originates and cells from a second species of animal and then determining which somatic cell hybrid(s) expresses the polypeptide and noting the chromosome(s) from the first species of animal that it contains. For examples of this technique, see Pajunen et al. (1988) Cytogenet. Cell Genet. 47:37-41 and Van Keuren et al. (1986) Hum. Genet. 74:34-40. Alternatively, the presence of the polypeptide in the somatic cell hybrids can be determined by assaying an activity or property of the polypeptide, for example, enzymatic activity, as described in Bordelon-Riser et al. (1979) Somatic Cell Genetics 5:597-613 and Owerbach et al. (1978) Proc. Natl. Acad. Sci. USA 75:5640-5644.

[0282] Tissue Typing:

[0283] The nucleic acid sequences of the present invention can also be used to identify individuals from minute biological samples. For example, the United States military is considering the use of restriction fragment length polymorphism (RFLP) for identification of its personnel. In this technique, an individual's genomic DNA is digested with one or more restriction enzymes and probed on a Southern blot to yield unique bands for identification. This method does not suffer from the current limitations of “Dog Tags” which can be lost, switched or stolen, making positive identification difficult. The sequences of the present invention are useful as additional DNA markers for RFLP, which is described in U.S. Pat. No. 5,272,057.

[0284] Furthermore, the sequences of the present invention can be used to provide an alternative technique which determines the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the nucleic acid sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These sequences can then be used to amplify an individual's DNA and subsequently sequence it.

[0285] Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences. The sequences of the present invention can be used to obtain such identification sequences from individuals and from tissue. The nucleic acid sequences of the invention uniquely represent portions of the human genome. Allelic variation occurs to some degree in the coding regions of these sequences and, to a greater degree, in the noncoding regions. It is estimated that allelic variation between individual humans occurs with a frequency of about once per each 500 bases. Each of the sequences described herein can, therefore, be used as a standard. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences (e.g., the 5′- and 3′-UTR and intronic sequences) of HKNG1, GNKH and TS can comfortably provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as HKNG1, GNKH and/or TS exon sequences, are used, a more appropriate number of primers for positive individual identification would be 500 to 2,000.

[0286] If a panel of reagents from the nucleic acid sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[0287] Use of Partial Gene Sequences in Forensic Biology:

[0288] DNA-based identification techniques can also be used in forensic biology. Forensic biology is a scientific field employing genetic typing of biological evidence found at a crime scene as a means for positively identifying, for example, a perpetrator of a crime. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissue samples, including, for example, samples of hair, skin or body fluids (e.g., blood, saliva or semen) found at a crime scene. The amplified sequences can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[0289] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions are particularly appropriate for this use as greater numbers of polymorphisms occur in the noncoding regions, making it easier to differentiate individuals using this technique. Examples of polynucleotide reagents include the HKNG1, GNKH and TS nucleic acid sequences of the invention as well as portions thereof, e.g., fragments derived from noncoding regions having a length of at least 20 or 30 bases, including, for example, the HKNG1 primer sequences provided in Table 1, above.

[0290] The nucleic acid sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue (e.g., brain tissue). This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such probes can be used to identify tissue by species and/or by organ type.

[0291] Predictive Medicine

[0292] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of the present invention relates to diagnostic assays for determining HKNG1, GNKH and/or TS activity, in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or is at risk of developing a disorder, associated with aberrant or unwanted HKNG1, GNKH and/or TS expression or activity. The invention also provides for prognostic (or predictive) assays for determining whether an individual is at risk of developing a disorder associated with HKNG1, GNKH and/or TS protein, nucleic acid expression or activity. For example, mutations in a HKNG1, GNKH and/or TS gene can be assayed in a biological sample. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with HKNG1, GNKH and/or TS protein, nucleic acid expression or activity.

[0293] As an alternative to making determinations based on the absolute expression level of selected genes, determinations may be based on the normalized expression levels of these genes. Expression levels are normalized by correcting the absolute expression level of a HKNG1, GNKH and/or TS gene by comparing its expression to the expression of a gene that is not a HKNG1, GNKH and/or TS gene, e.g., a housekeeping gene that is constitutively expressed. Suitable genes for normalization include housekeeping genes such as the actin gene. This normalization allows the comparison of the expression level in one sample, e.g., a patient sample, to another sample, e.g., a non-disease sample, or between samples from different sources.

[0294] Alternatively, the expression level can be provided as a relative expression level. To determine a relative expression level of a gene, the level of expression of the gene is determined for 10 or more samples of different cell isolates, preferably 50 or more samples, prior to the determination of the expression level for the sample in question. The cell isolates are selected depending upon the tissues in which the gene of interest is expressed. The mean expression level of each of the genes assayed in the larger number of samples is determined and this is used as a baseline expression level for the gene(s) in question. The expression level of the gene determined for the test sample (absolute level of expression) is then divided by the mean expression value obtained for that gene. This provides a relative expression level and aids in identifying extreme cases of HKNG1, GNKH and/or TS-mediated disease.
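
The relative expression calculation described above can be sketched in Python as follows. The function name, the default threshold argument and the numerical values are hypothetical illustrations; the only steps taken from the text are the use of at least 10 baseline samples, the mean of those samples as the baseline, and division of the test sample level by that mean.

```python
from statistics import mean

def relative_expression(test_level, baseline_levels, min_samples=10):
    """Divide a test sample's expression level by the baseline mean.

    baseline_levels holds expression levels measured in different cell
    isolates; the text calls for 10 or more, preferably 50 or more.
    """
    if len(baseline_levels) < min_samples:
        raise ValueError("baseline requires at least %d samples" % min_samples)
    baseline = mean(baseline_levels)
    if baseline <= 0:
        raise ValueError("baseline mean must be positive")
    return test_level / baseline

# Hypothetical expression levels in arbitrary units.
baseline_samples = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 4.0, 6.0]
ratio = relative_expression(15.0, baseline_samples)
```

A ratio far above or below 1 marks a sample whose expression departs strongly from the baseline; the baseline list can be extended and the mean recomputed as further data accumulate.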

[0295] Preferably, the samples used in the baseline determination will be from HKNG1, GNKH and/or TS-mediated diseased or from non-diseased cells of tissue. The choice of the cell source is dependent on the use of the relative expression level. Using expression found in normal tissues as a mean expression score aids in validating whether the HKNG1, GNKH and/or TS gene assayed is cell-type specific for the tissues in which expression is observed versus the expression found in normal cells. Such a use is particularly important in identifying whether a HKNG1, GNKH and/or TS gene can serve as a target gene. In addition, as more data is accumulated, the mean expression value can be revised, providing improved relative expression values based on accumulated data.

[0296] Another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of HKNG1, GNKH and/or TS in clinical trials.

5.5.2. DETECTION OF GENE PRODUCTS

[0297] Antibodies directed against unimpaired or mutant gene products of the invention (e.g., the HKNG1, GNKH or TS gene products described in Section 5.2, above) or conserved variants or peptide fragments thereof may also be used as diagnostics and prognostics for disorders such as neuropsychiatric disorders, e.g., BAD or schizophrenia, that are associated with or mediated by HKNG1, GNKH or TS. Such antibodies are described, in detail, in Section 5.3, above. Such methods may be used, e.g., to detect abnormalities in the level of HKNG1, GNKH or TS gene product synthesis or expression, or abnormalities in the structure, temporal expression, and/or physical location of a HKNG1, GNKH or TS gene product (e.g., the expression or location of a HKNG1, GNKH or TS gene product in a cell or tissue). The antibodies and immunoassay methods described herein have, for example, important in vitro applications in assessing the efficacy of treatments for disorders associated with or mediated by a HKNG1, GNKH or TS gene product. For example, antibodies, or fragments of antibodies, such as those described below, may be used to screen potentially therapeutic compounds in vitro to determine their effects on HKNG1, GNKH or TS gene expression and/or HKNG1, GNKH or TS gene product production.

[0298] In vitro immunoassays may also be used, for example, to assess the efficacy of cell-based gene therapy for a disorder mediated by HKNG1, GNKH or TS (e.g., a neuropsychiatric disorder, such as BAD or schizophrenia). Antibodies directed against HKNG1, GNKH or TS gene products may be used in vitro to determine, for example, the level of HKNG1, GNKH or TS gene expression achieved in cells genetically engineered to produce HKNG1, GNKH or TS gene product. In the case of intracellular HKNG1, GNKH or TS gene products, such an assessment is done, preferably, using cell lysates or extracts. Such analysis will allow for a determination of the number of transformed cells necessary to achieve therapeutic efficacy in vivo, as well as optimization of the gene replacement protocol.

[0299] The tissue or cell type to be analyzed will generally include those that are known, or suspected, to express either the HKNG1 gene, the GNKH gene, or the TS gene or each of the HKNG1, the GNKH and the TS genes. The protein isolation methods employed herein may, for example, be such as those described in Harlow and Lane (1988, “Antibodies: A Laboratory Manual”, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The isolated cells can be derived from cell culture or from a patient. The analysis of cells taken from culture may be a necessary step in the assessment of cells to be used as part of a cell-based gene therapy technique or, alternatively, to test the effect of compounds on the expression of the HKNG1, GNKH or TS gene.

[0300] Preferred diagnostic methods for the detection of gene products of the invention, including HKNG1, GNKH and TS gene products, conserved variants and peptide fragments thereof, may involve, for example, immunoassays wherein the HKNG1, GNKH or TS gene products or conserved variants or peptide fragments are detected by their interaction with a gene product-specific antibody (e.g., an anti-HKNG1 gene product specific antibody, an anti-GNKH gene product specific antibody, an anti-TS gene product specific antibody).

[0301] For example, antibodies, or fragments of antibodies, such as those described, above, in Section 5.3, may be used to quantitatively or qualitatively detect the presence of HKNG1, GNKH or TS gene products or conserved variants or peptide fragments thereof. This can be accomplished, for example, by immunofluorescence techniques employing a fluorescently labeled antibody, as described hereinbelow, coupled with light microscopic, flow cytometric, or fluorimetric detection. Such techniques are especially preferred for gene products that are expressed on the cell surface.

[0302] The antibodies (or fragments thereof) useful in the present invention may, additionally, be employed histologically, as in immunofluorescence or immunoelectron microscopy, for in situ detection of gene products of the invention (e.g., of HKNG1, GNKH or TS gene products), conserved variants or peptide fragments thereof. In situ detection may be accomplished, e.g., by removing a histological specimen from a patient, and applying thereto a labeled antibody that binds to an HKNG1, GNKH or TS polypeptide. The antibody (or fragment) is preferably applied by overlaying the labeled antibody (or fragment) onto a biological sample. Through the use of such a procedure, it is possible to determine the presence of the targeted gene product (e.g., the HKNG1, GNKH or TS gene product, conserved variants or peptide fragments thereof) in a sample, as well as its distribution in the examined tissue. Using the present invention, those of ordinary skill will readily recognize that any of a wide variety of histological methods (such as staining procedures) can be modified in order to achieve in situ detection of a HKNG1, GNKH or TS gene product.

[0303] Immunoassays for HKNG1, GNKH or TS gene products, conserved variants, or peptide fragments thereof will typically comprise incubating a sample, such as a biological fluid, a tissue extract, freshly harvested cells, or lysates of cells in the presence of a detectably labeled antibody capable of identifying HKNG1, GNKH or TS gene product, conserved variants or peptide fragments thereof, and detecting the bound antibody by any of a number of techniques well-known in the art.

[0304] The biological sample may be brought in contact with and immobilized onto a solid phase support or carrier, such as nitrocellulose, that is capable of immobilizing cells, cell particles or soluble proteins. The support may then be washed with suitable buffers followed by treatment with the detectably labeled antibody (e.g., detectably labeled anti-HKNG1 gene product specific antibody, detectably labeled anti-GNKH gene product specific antibody, or detectably labeled anti-TS gene product specific antibody). The solid phase support may then be washed with the buffer a second time to remove unbound antibody. The amount of bound label on the solid support may then be detected by conventional means.

[0305] By “solid phase support or carrier” is intended any support capable of binding an antigen or an antibody. Well-known supports or carriers include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, gabbros, and magnetite. The nature of the carrier can be either soluble to some extent or insoluble for the purposes of the present invention. The support material may have virtually any possible structural configuration so long as the coupled molecule is capable of binding to an antigen or antibody. Thus, the support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external surface of a rod. Alternatively, the surface may be flat such as a sheet, test strip, etc. Preferred supports include polystyrene beads. Those skilled in the art will know many other suitable carriers for binding antibody or antigen, or will be able to ascertain the same by use of routine experimentation.

[0306] One of the ways in which the antibody can be detectably labeled is by linking the same to an enzyme, such as for use in an enzyme immunoassay (EIA) (Voller, A., “The Enzyme Linked Immunosorbent Assay (ELISA)”, 1978, Diagnostic Horizons 2:1-7, Microbiological Associates Quarterly Publication, Walkersville, Md.; Voller, A. et al., 1978, J. Clin. Pathol. 31:507-520; Butler, J. E., 1981, Meth. Enzymol. 73:482-523; Maggio, E. (ed.), 1980, Enzyme Immunoassay, CRC Press, Boca Raton, Fla.; Ishikawa, E. et al., (eds.), 1981, Enzyme Immunoassay, Igaku Shoin, Tokyo). The enzyme which is bound to the antibody will react with an appropriate substrate, preferably a chromogenic substrate, in such a manner as to produce a chemical moiety that can be detected, for example, by spectrophotometric, fluorimetric or visual means. Enzymes that can be used to detectably label the antibody include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, α-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, β-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase. The detection can be accomplished by colorimetric methods that employ a chromogenic substrate for the enzyme. Alternatively, detection can be accomplished by incubating the enzyme labeled antibodies with a substrate that can be catalytically converted to a chemiluminescent product (see below) and detecting the luminescence that arises during the course of a chemical reaction. Detection may also be accomplished by visual comparison of the extent of enzymatic reaction of a substrate in comparison with similarly prepared standards.

[0307] Detection may also be accomplished using any of a variety ofother immunoassays. For example, by radioactively labeling theantibodies or antibody fragments, it is possible to detect HKNG1, GNKHor TS gene products through the use of a radioimmunoassay (RIA) (see,for example, Weintraub, B., Principles of Radioimmunoassays, SeventhTraining Course on Radioligand Assay Techniques, The Endocrine Society,March, 1986). The radioactive isotope can be detected by such means asthe use of a gamma counter or a scintillation counter or byautoradiography.

[0308] It is also possible to label the antibody with a fluorescent compound. When the fluorescently labeled antibody is exposed to light of the proper wavelength, its presence can then be detected due to fluorescence. Among the most commonly used fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine.

[0309] The antibody can also be detectably labeled using fluorescence emitting metals such as 152Eu, or others of the lanthanide series. These metals can be attached to the antibody using such metal chelating groups as diethylenetriaminepentaacetic acid (DTPA) or ethylenediaminetetraacetic acid (EDTA).

[0310] The antibody also can be detectably labeled by coupling it to a chemiluminescent compound. The presence of the chemiluminescent-tagged antibody is then determined by detecting the presence of luminescence that arises during the course of a chemical reaction. Examples of particularly useful chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester.

[0311] Likewise, a bioluminescent compound may be used to label the antibody of the present invention. Bioluminescence is a type of chemiluminescence found in biological systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The presence of a bioluminescent protein is determined by detecting the presence of luminescence. Important bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin.

[0312] Further, an antibody (or fragment thereof) can be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent, a drug moiety, or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[0313] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[0314] Techniques for conjugating such therapeutic moiety to antibodies are well known, see, e.g., Arnon et al., “Monoclonal Antibodies For Immunotargeting Of Drugs In Cancer Therapy”, in Monoclonal Antibodies And Cancer Therapy, Reisfeld et al. (eds.), pp. 243-56 (Alan R. Liss, Inc. 1985); Hellstrom et al., “Antibodies For Drug Delivery”, in Controlled Drug Delivery (2nd Ed.), Robinson et al. (eds.), pp. 623-53 (Marcel Dekker, Inc. 1987); Thorpe, “Antibody Carriers Of Cytotoxic Agents In Cancer Therapy: A Review”, in Monoclonal Antibodies '84: Biological And Clinical Applications, Pinchera et al. (eds.), pp. 475-506 (1985); “Analysis, Results, And Future Prospective Of The Therapeutic Use Of Radiolabeled Antibody In Cancer Therapy”, in Monoclonal Antibodies For Cancer Detection And Therapy, Baldwin et al. (eds.), pp. 303-16 (Academic Press 1985), and Thorpe et al., “The Preparation And Cytotoxic Properties Of Antibody-Toxin Conjugates”, Immunol. Rev., 62:119-58 (1982).

[0315] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[0316] Accordingly, in one aspect, the invention provides substantially purified antibodies or fragments thereof, and non-human antibodies or fragments thereof, which antibodies or fragments specifically bind to a polypeptide comprising an amino acid sequence selected from the group consisting of: the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, or an amino acid sequence encoded by the cDNA of ATCC® No.; a fragment of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142; an amino acid sequence which is at least 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, wherein the percent identity is determined using the ALIGN program of the GCG software package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs: 1, 3, 5, 6, 7, 34, 35, 36, 37, 38, 40, 42, 44, 46, 47, 48, 65, 73, 74, 109, 111, 113, 119, 121, 122, 123, 124, 134, 136, 138, 140, 141, 143, or the cDNA of ATCC® No., or a complement thereof, under conditions of hybridization of 6×SSC at 45° C. and washing in 0.2×SSC, 0.1% SDS at 65° C. In various embodiments, the substantially purified antibodies of the invention, or fragments thereof, can be human, non-human, chimeric and/or humanized antibodies.

[0317] In another aspect, the invention provides non-human antibodies or fragments thereof, which antibodies or fragments specifically bind to a polypeptide comprising an amino acid sequence selected from the group consisting of: the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, or an amino acid sequence encoded by the cDNA of ATCC® No.; a fragment of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142; an amino acid sequence which is at least 95% identical to the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, wherein the percent identity is determined using the ALIGN program of the GCG software package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs: 1, 3, 5, 6, 7, 34, 35, 36, 37, 38, 40, 42, 44, 46, 47, 48, 65, 73, 74, 109, 111, 113, 119, 121, 122, 123, 124, or the cDNA of ATCC® No., or a complement thereof, under conditions of hybridization of 6×SSC at 45° C. and washing in 0.2×SSC, 0.1% SDS at 65° C. Such non-human antibodies can be goat, mouse, sheep, horse, chicken, rabbit, or rat antibodies. Alternatively, the non-human antibodies of the invention can be chimeric and/or humanized antibodies. In addition, the non-human antibodies of the invention can be polyclonal antibodies or monoclonal antibodies.

[0318] In still a further aspect, the invention provides monoclonal antibodies or fragments thereof, which antibodies or fragments specifically bind to a polypeptide comprising an amino acid sequence selected from the group consisting of: the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, or an amino acid sequence encoded by the cDNA of ATCC® No.; a fragment of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142; an amino acid sequence which is at least 95% identical to the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, wherein the percent identity is determined using the ALIGN program of the GCG software package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs: 1, 3, 5, 6, 7, 34, 35, 36, 37, 38, 40, 42, 44, 46, 47, 48, 65, 73, 74, 109, 111, 113, 119, 121, 122, 123, 124, or the cDNA of ATCC® No., or a complement thereof, under conditions of hybridization of 6×SSC at 45° C. and washing in 0.2×SSC, 0.1% SDS at 65° C. The monoclonal antibodies can be human, humanized, chimeric and/or non-human antibodies.
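By way of illustration only, the percent identity threshold recited above can be checked once two sequences have already been aligned. The following is a minimal sketch, assuming the gapped alignment was produced elsewhere (it does not reproduce the ALIGN program, the PAM120 table, or the gap penalties) and assuming, as one possible convention, that identity is counted over all aligned columns. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns,
    excluding columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except gap-versus-gap columns.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment is empty")
    # A column counts as a match only when both residues are identical
    # and neither is a gap.
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 95.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two hypothetical ten-residue sequences that differ at one position are 90% identical under this convention and so would not meet a 95% threshold.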

[0319] The substantially purified antibodies or fragments thereof specifically bind to a signal peptide, a secreted sequence, an extracellular domain, a transmembrane or a cytoplasmic domain of a polypeptide of the invention. In one embodiment, the substantially purified antibodies or fragments thereof, the human or non-human antibodies or fragments thereof, and/or the monoclonal antibodies or fragments thereof, of the invention specifically bind to a secreted sequence or an extracellular domain of the amino acid sequence of SEQ ID NO: 142. Preferably, the secreted sequence or extracellular domain to which the antibody, or fragment thereof, binds comprises from about amino acids 1-186 of SEQ ID NO:142 (SEQ ID NO:144), and from amino acids 244-313 of SEQ ID NO:142 (SEQ ID NO:145).

[0320] Any of the antibodies of the invention can be conjugated to a therapeutic moiety or to a detectable substance. Non-limiting examples of detectable substances that can be conjugated to the antibodies of the invention are an enzyme, a prosthetic group, a fluorescent material, a luminescent material, a bioluminescent material, and a radioactive material.

[0321] The invention also provides a kit containing an antibody of the invention conjugated to a detectable substance, and instructions for use. Still another aspect of the invention is a pharmaceutical composition comprising an antibody of the invention and a pharmaceutically acceptable carrier. In one embodiment, the pharmaceutical composition contains an antibody of the invention, a therapeutic moiety, and a pharmaceutically acceptable carrier.

[0322] Still another aspect of the invention is a method of making an antibody that specifically recognizes HKNG1, GNKH or TS, the method comprising immunizing a mammal with a polypeptide. The polypeptide used as an immunogen comprises an amino acid sequence selected from the group consisting of: the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, or an amino acid sequence encoded by the cDNA of ATCC® No.; a fragment of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142; an amino acid sequence which is at least 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of any one of SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, wherein the percent identity is determined using the ALIGN program of the GCG software package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs: 1, 3, 5, 6, 7, 34, 35, 36, 37, 38, 40, 42, 44, 46, 47, 48, 65, 73, 74, 109, 111, 113, 119, 121, 122, 123, 124, or the cDNA of ATCC® No., or a complement thereof, under conditions of hybridization of 6×SSC at 45° C. and washing in 0.2×SSC, 0.1% SDS at 65° C. After immunization, a sample is collected from the mammal that contains an antibody that specifically recognizes a HKNG1, GNKH or TS polypeptide as exemplified in SEQ ID NOs: 2, 4, 39, 41, 43, 45, 49, 51, 66, 75, 76, 110, 112, 114, 120, 131, 132, 133, 135, 137, 139, 142, or portions thereof. Preferably, the polypeptide is recombinantly produced using a non-human host cell. Optionally, the antibodies can be further purified from the sample using techniques well known to those of skill in the art. The method can further comprise producing a monoclonal antibody-producing cell from the cells of the mammal. Optionally, antibodies are collected from the antibody-producing cell.
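For orientation, the hybridization and wash conditions recited above are stated as multiples of SSC. The sketch below converts such a multiple into molar concentrations, assuming the conventional laboratory definition of 1×SSC (0.15 M sodium chloride, 0.015 M trisodium citrate), which is not defined in this text. The function name and dictionary keys are hypothetical.

```python
# Assumed conventional composition of 1x SSC (not stated in the text above).
SSC_1X_NACL_M = 0.150      # sodium chloride, mol/L
SSC_1X_CITRATE_M = 0.015   # trisodium citrate, mol/L


def ssc_composition(fold: float) -> dict:
    """Molar composition of an SSC buffer at the given fold concentration."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    nacl = fold * SSC_1X_NACL_M
    citrate = fold * SSC_1X_CITRATE_M
    # Each trisodium citrate contributes three sodium ions.
    sodium = nacl + 3 * citrate
    return {"NaCl_M": nacl, "citrate_M": citrate, "Na_ion_M": sodium}


hybridization = ssc_composition(6)   # 6x SSC hybridization step
wash = ssc_composition(0.2)          # 0.2x SSC wash step
```

Under this assumption the 6×SSC hybridization buffer is roughly thirty-fold more concentrated in salt than the 0.2×SSC wash, which is why the wash step is the more stringent of the two.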

5.6. SCREENING ASSAYS FOR COMPOUNDS THAT MODULATE GENE AND/OR GENE PRODUCT ACTIVITY

[0323] This section describes assays that can be used, e.g., to identify compounds that bind to one of the genes or gene products of the present invention (e.g., compounds that bind to a HKNG1 gene or gene product, compounds that bind to a GNKH gene or gene product, or compounds that bind to a TS gene or gene product), to identify compounds that bind to proteins or to portions of proteins that interact with one of the genes or gene products of the present invention (e.g., proteins or portions of proteins that interact with a HKNG1 gene or gene product, proteins or portions of proteins that interact with a GNKH gene or gene product, or proteins or portions of proteins that interact with a TS gene or gene product), compounds that modulate, e.g., interfere with, the interaction of a gene or gene product of the invention with a protein, such as a ligand (e.g., compounds that modulate the interaction of a HKNG1 gene or gene product with a protein, compounds that modulate the interaction of a GNKH gene or gene product with a protein, or compounds that modulate the interaction of a TS gene or gene product with a protein), and compounds that modulate the activity of a gene or gene product of the invention (i.e., compounds that modulate the level of HKNG1, GNKH or TS gene expression and/or modulate the level of HKNG1, GNKH or TS gene product activity). The assays described herein can also be utilized to identify compounds that bind to gene regulatory sequences (e.g., HKNG1, GNKH or TS gene regulatory sequences such as promoter sequences; see, e.g., Platt, 1994, J. Biol. Chem. 269:28558-28562), and thereby modulate gene expression. Such compounds may include, but are not limited to, small organic molecules, such as ones that are able to cross the blood-brain barrier, gain access to and/or entry into an appropriate cell and affect expression of the HKNG1, GNKH or TS gene or some other gene involved in a HKNG1, GNKH or TS regulatory pathway.

[0324] Specifically, in vitro screening assays that can be used to identify compounds that bind to a gene or gene product of the invention (e.g., to a HKNG1 gene or gene product, to a GNKH gene or gene product, or a TS gene or gene product) are described in Section 5.6.1, hereinbelow. Screening assays that can be used to identify proteins that interact with a gene or gene product of the invention (e.g., with a HKNG1 gene or gene product, with a GNKH gene or gene product, or with a TS gene or gene product) are also described hereinbelow, in Section 5.6.2. Section 5.6.3, below, describes assays that can be used to identify compounds that interfere with or potentiate interactions between a gene or gene product of the invention and another macromolecule, such as a ligand (e.g., interactions between a HKNG1 gene or gene product of the invention and a ligand, interactions between a GNKH gene or gene product of the invention and a ligand, or interactions between a TS gene or gene product of the invention and a ligand).

[0325] Compounds identified through such assays will be of particular interest to one skilled in the art and may be useful, e.g., for elaborating the biological function of the genes and/or gene products of the present invention (i.e., for elaborating the biological function of HKNG1, GNKH and/or TS). Such compounds may also be involved in the control or regulation of mood in vivo, and can therefore be used, e.g., in the therapeutic methods and compositions of the present invention (see, e.g., Section 5.7, below) to treat disorders, such as neuropsychiatric disorders (e.g., BAD or schizophrenia) that are associated with or mediated by HKNG1, GNKH or TS. Accordingly, additional screening methods are described, in Section 5.6.4 hereinbelow, for testing the effectiveness of compounds, including compounds identified in the assays described in Sections 5.6.1-5.6.3, e.g., in the treatment of disorders, such as neuropsychiatric disorders, that are associated with or mediated by HKNG1, GNKH or TS.

[0326] The compounds may include, but are not limited to, peptides such as, for example, soluble peptides, including but not limited to, Ig-tailed fusion peptides, and members of random peptide libraries (see, e.g., Lam, et al., 1991, Nature 354:82-84; Houghten, et al., 1991, Nature 354:84-86), and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids, phosphopeptides (including, but not limited to, members of random or partially degenerate, directed phosphopeptide libraries; see, e.g., Songyang, et al., 1993, Cell 72:767-778), antibodies (including, but not limited to, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, and epitope-binding fragments thereof), and small organic or inorganic molecules.

[0327] Such compounds may further comprise compounds, in particular drugs or members of classes or families of drugs, known to ameliorate the symptoms of a HKNG1, GNKH or TS-mediated disorder, e.g., a neuropsychiatric disorder such as BAD or schizophrenia.

[0328] Such compounds include families of antidepressants such as lithium salts, carbamazepine, valproic acid, lysergic acid diethylamide (LSD), p-chlorophenylalanine, p-propyldopacetamide dithiocarbamate derivatives, e.g., FLA 63; anti-anxiety drugs, e.g., diazepam; monoamine oxidase (MAO) inhibitors, e.g., iproniazid, clorgyline, phenelzine and isocarboxazid; biogenic amine uptake blockers, e.g., tricyclic antidepressants such as desipramine, imipramine and amitriptyline; serotonin reuptake inhibitors, e.g., fluoxetine; antipsychotic drugs such as phenothiazine derivatives (e.g., chlorpromazine (thorazine) and trifluopromazine), butyrophenones (e.g., haloperidol (Haldol)), thioxanthene derivatives (e.g., chlorprothixene), and dibenzodiazepines (e.g., clozapine); benzodiazepines; dopaminergic agonists and antagonists, e.g., L-DOPA, cocaine, amphetamine, α-methyl-tyrosine, reserpine, tetrabenazine, benzotropine, pargyline; noradrenergic agonists and antagonists, e.g., clonidine, phenoxybenzamine, phentolamine, tropolone.

5.6.1. IN VITRO SCREENING ASSAYS

[0329] In vitro systems may be readily designed, as described herein, to identify compounds capable of binding the gene products of the present invention (e.g., to a HKNG1, GNKH or TS gene product). Compounds identified by such assays may be useful, for example, in modulating the activity of unimpaired and/or mutant HKNG1, GNKH or TS gene products, may be useful in elaborating the biological function of the HKNG1, GNKH or TS gene product, may be utilized in screens for identifying compounds that disrupt normal HKNG1, GNKH or TS gene product interactions, or may in themselves disrupt such interactions.

[0330] The principle of the assays used to identify compounds that bind to a gene product of the invention involves preparing a reaction mixture of the gene product and a test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected in the reaction mixture. Such assays can be conducted in a variety of ways. For example, one method to conduct such an assay involves anchoring a gene product of the invention or a test substance onto a solid support and detecting complexes of the gene product and test compound formed on the solid support at the end of the reaction.

[0331] In one embodiment of such a method, the gene product may be anchored onto a solid support, and the test compound, which is not anchored, may be labeled, either directly or indirectly. In practice, microtiter plates are conveniently utilized as the solid support in such assays. The anchored component may be immobilized by non-covalent or covalent attachments. For example, non-covalent attachment may be accomplished by simply coating the solid surface with a solution of the protein and drying. Alternatively, an immobilized antibody, preferably a monoclonal antibody, specific for the protein to be immobilized may be used to anchor the protein to the solid surface. Additionally, such surfaces may be prepared in advance and stored for future use.

[0332] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the previously non-immobilized component (the antibody, in turn, may be directly labeled or indirectly labeled with a labeled anti-Ig antibody).

[0333] Alternatively, a reaction can be conducted in a liquid phase, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for either the gene product or the test compound to anchor any complexes formed in solution, and a labeled antibody specific for the other component of the possible complex to detect anchored complexes.

5.6.2. ASSAYS FOR PROTEINS THAT INTERACT WITH HKNG1, GNKH OR TS GENE PRODUCTS

[0334] Any method suitable for detecting protein-protein interactions may be used in the screening assays of the present invention to detect and/or identify interactions between proteins and a gene product of the present invention (e.g., interactions between a HKNG1 gene product and a protein, interactions between a GNKH gene product and a protein, or alternatively, interactions between a TS gene product and a protein). Indeed, a variety of techniques for detecting protein-protein interactions are well known in the art, and may be used, therefore, in the screening assays of the present invention.

[0335] Among the traditional methods that may be employed are co-immunoprecipitation, cross-linking and co-purification through gradients or chromatographic columns. Utilizing procedures such as these allows for the identification of proteins, including intracellular proteins, that interact with gene products of the present invention including, in particular, HKNG1, GNKH or TS gene products. Once isolated, such a protein can be identified and characterized using standard techniques. For example, at least a portion of the amino acid sequence of a protein that interacts with a gene product of the present invention (e.g., a HKNG1, GNKH or TS gene product) can be ascertained using techniques well known to those of skill in the art, such as via the Edman degradation technique (see, e.g., Creighton, 1983, “Proteins: Structures and Molecular Principles,” W.H. Freeman & Co., N.Y., pp. 34-49). The amino acid sequence obtained may be used as a guide for the generation of oligonucleotide mixtures that can be used to screen for gene sequences encoding such proteins. Screening may be accomplished, for example, by standard hybridization or PCR techniques. Techniques for the generation of oligonucleotide mixtures and the screening are well-known. (See, e.g., Ausubel, supra, and 1990, “PCR Protocols: A Guide to Methods and Applications,” Innis, et al., eds. Academic Press, Inc., New York).

[0336] Additionally, methods may be employed that result in the simultaneous identification of a protein which interacts with a gene product of the invention and of the gene encoding such a protein. These methods include, for example, probing expression libraries with a labeled gene product (e.g., a labeled HKNG1, GNKH or TS gene product), using the gene product in a manner similar to the well known technique of antibody probing of λgt11 libraries.

[0337] One method that detects protein interactions in vivo, the two-hybrid system, is described in detail for illustration only and not by way of limitation. One version of this system has been described (Chien, et al., 1991, Proc. Natl. Acad. Sci. USA, 88:9578-9582) and is commercially available from Clontech (Palo Alto, Calif.). Briefly, utilizing such a system, plasmids are constructed that encode two hybrid proteins. One hybrid protein consists of the DNA-binding domain of a transcription activator protein fused to the gene product of interest (i.e., a gene product of the invention such as a HKNG1, GNKH or TS gene product). The other hybrid protein consists of the transcription activator protein's activation domain fused to an unknown protein encoded by a cDNA that has been recombined into this plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are transformed, e.g., into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., His3 or lacZ) whose regulatory region contains the transcription activator's binding site. Neither hybrid protein alone can activate transcription of the reporter gene: the DNA-binding domain hybrid cannot because it does not provide activation function, and the activation domain hybrid cannot because it cannot localize to the activator's binding sites. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product.
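The logic of the two-hybrid readout described above can be summarized, purely as an illustrative sketch, by a small truth-functional model: the reporter is expressed only when a DNA-binding domain hybrid and an activation domain hybrid are both present and their fused proteins interact. The class, the function and the library protein name "preyX" are hypothetical, and the model ignores biological complications such as autoactivating baits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    """One hybrid protein of the two-hybrid system (illustrative model)."""
    has_dna_binding_domain: bool
    has_activation_domain: bool
    fused_protein: str


def reporter_expressed(hybrids, interactions) -> bool:
    """True when some DNA-binding hybrid and some activation hybrid interact.

    `interactions` is a set of frozensets, each naming a pair of fused
    proteins that bind one another.
    """
    binders = [h for h in hybrids if h.has_dna_binding_domain]
    activators = [h for h in hybrids if h.has_activation_domain]
    return any(
        frozenset((b.fused_protein, a.fused_protein)) in interactions
        for b in binders
        for a in activators
    )


# Bait: DNA-binding domain fused to the gene product of interest.
bait = Hybrid(True, False, "HKNG1")
# Prey: activation domain fused to a hypothetical library-encoded protein.
prey = Hybrid(False, True, "preyX")
known_interactions = {frozenset(("HKNG1", "preyX"))}
```

In this model, either hybrid on its own leaves the reporter silent, and the reporter is switched on only when both are supplied and the pair appears in `known_interactions`.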

[0338] The two-hybrid system or related methodologies may be used to screen activation domain libraries for proteins that interact with the “bait” gene product. By way of example, and not by way of limitation, a gene product of the invention (e.g., HKNG1, GNKH or TS) may be used as the bait gene product. Total genomic or cDNA sequences are fused to the DNA encoding an activation domain. This library and a plasmid encoding a hybrid of the bait gene product fused to the DNA-binding domain are co-transformed into a yeast reporter strain, and the resulting transformants are screened for those that express the reporter gene. For example, a bait gene sequence, such as an open reading frame of the HKNG1, GNKH or TS gene, can be cloned into a vector such that it is translationally fused to the DNA encoding the DNA-binding domain of the GAL4 protein. These colonies are purified and the library plasmids responsible for reporter gene expression are isolated. DNA sequencing is then used to identify the proteins encoded by the library plasmids.

[0339] A cDNA library of the cell line from which proteins that interact with the bait gene product are to be detected can be made using methods routinely practiced in the art. According to the particular system described herein, for example, the cDNA fragments can be inserted into a vector such that they are translationally fused to the transcriptional activation domain of GAL4. Such a library can be co-transformed along with the bait gene-GAL4 fusion plasmid into a yeast strain that contains a lacZ gene driven by a promoter that contains a GAL4 activation sequence. A cDNA encoded protein, fused to a GAL4 transcriptional activation domain, that interacts with the bait gene product will reconstitute an active GAL4 protein and thereby drive expression of the HIS3 gene. Colonies that express HIS3 can be detected by their growth on petri dishes containing semi-solid agar based media lacking histidine. The cDNA can then be purified from these strains, and used to produce and isolate the bait gene product-interacting protein using techniques routinely practiced in the art.

5.6.3. ASSAYS FOR COMPOUNDS THAT INTERFERE WITH OR POTENTIATE GENE PRODUCT-MACROMOLECULAR INTERACTION

[0340] The HKNG1, GNKH and TS gene products of the present invention may, in vivo, interact with one or more macromolecules, including intracellular macromolecules such as proteins. Such macromolecules can include, but are not limited to, nucleic acid molecules and proteins identified via methods such as those described, above, in Sections 5.6.1-5.6.2. For purposes of this discussion, the macromolecules are referred to herein as “binding partners”. Compounds that disrupt binding of a HKNG1, GNKH or TS gene product to a binding partner may be useful, e.g., in regulating the activity of the HKNG1, GNKH or TS gene product, especially mutant HKNG1, GNKH or TS gene products. Such compounds may include, but are not limited to, molecules such as peptides, and the like, as described, for example, in Section 5.6.2 above.

[0341] The basic principle of an assay system used to identify compounds that interfere with or potentiate the interaction between a gene product such as HKNG1, GNKH or TS and a binding partner or partners involves preparing a reaction mixture containing the gene product of interest (i.e., a gene product of the present invention such as a HKNG1, GNKH or TS gene product) and its binding partner under conditions and for a time sufficient to allow the two to interact and bind, thus forming a complex. In order to test a compound for inhibitory activity, the reaction mixture is prepared in the presence and absence of the test compound. The test compound may be initially included in the reaction mixture, or may be added at a time subsequent to the addition of the gene product of interest and its binding partner. Control reaction mixtures are incubated without the test compound or with a compound which is known not to block complex formation. The formation of any complexes between the gene product and the binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the gene product and the binding partner. Additionally, complex formation within reaction mixtures containing the test compound and a normal or “wild-type” gene product (e.g., a normal or wild-type HKNG1, GNKH or TS gene product) may also be compared to complex formation within reaction mixtures containing the test compound and some variant of the same gene product (e.g., a mutant HKNG1, GNKH or TS gene product). Such a comparison may be important, e.g., in those cases wherein it is desirable to identify compounds that disrupt interactions of a mutant but not a normal gene product of the invention.

[0342] In order to test a compound for potentiating activity (i.e., compounds that enhance complex formation between a gene product and its binding partner), the reaction mixture is prepared in the presence and absence of the test compound. The test compound may be initially included in the reaction mixture, or may be added at a time subsequent to the addition of the gene product and its binding partner. Control reaction mixtures are incubated without the test compound or with a compound which is known not to block complex formation. The formation of any complexes between the gene product and the binding partner is then detected. Increased formation of a complex in the reaction mixture containing the test compound, but not in the control reaction, indicates that the compound enhances and therefore potentiates the interaction of the gene product and the binding partner. Additionally, complex formation within reaction mixtures containing the test compound and a normal or wild-type gene product, such as a normal or wild-type HKNG1, GNKH or TS gene product, may also be compared to complex formation within reaction mixtures containing the test compound and a variant of the same gene product, such as a mutant HKNG1, GNKH or TS gene product. This comparison may be important in those cases wherein it is desirable to identify compounds that enhance interactions of mutant but not normal HKNG1, GNKH or TS gene product.
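The comparisons described in the two preceding paragraphs can be expressed, as a minimal sketch only, by comparing a complex-formation signal measured with the test compound against the signal from the control reaction. The function names, the use of a simple signal ratio, and the 20% tolerance band are assumptions made for illustration; no numerical criterion is specified in this text.

```python
def classify_compound(control_signal: float, test_signal: float,
                      tolerance: float = 0.2) -> str:
    """Classify a test compound from complex-formation signals.

    Assumption: signals are positive, background-corrected readings, and
    a change smaller than `tolerance` (as a fraction of the control) is
    treated as no detectable effect.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    ratio = test_signal / control_signal
    if ratio < 1 - tolerance:
        # Less complex than in the control: the compound interferes.
        return "interferes"
    if ratio > 1 + tolerance:
        # More complex than in the control: the compound potentiates.
        return "potentiates"
    return "no effect detected"


def mutant_selective(wild_type_effect: str, mutant_effect: str) -> bool:
    """True when an effect is seen with the mutant but not the wild type."""
    return (mutant_effect != "no effect detected"
            and wild_type_effect == "no effect detected")
```

Applied to paired reactions with a wild-type and a mutant gene product, `mutant_selective` flags compounds of the kind the text singles out, namely those that alter interactions of the mutant but not the normal gene product.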

[0343] In alternative embodiments, the above assays may be performed using a reaction mixture containing a gene product of interest (e.g., HKNG1, GNKH or TS), a binding partner, and a third compound which disrupts or enhances binding of the gene product to the binding partner. The reaction mixture is prepared and incubated in the presence and absence of the test compound, as described above, and the formation of any complexes between the gene product and the binding partner is detected. In this embodiment, the formation of a complex in the reaction mixture containing the test compound, but not in the control reaction, indicates that the test compound interferes with the ability of the third compound to disrupt binding of the gene product to its binding partner.

[0344] The assays for compounds that interfere with or potentiate the interaction of a gene product of the invention (i.e., a HKNG1, GNKH or TS gene product) and binding partners can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the gene product or the binding partner onto a solid support and detecting complexes formed on the solid support at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with or potentiate the interaction between a gene product of the invention and its binding partner or partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance; i.e., by adding the test substance to the reaction mixture prior to or simultaneously with the gene product and its interactive binding partner. Alternatively, test compounds that disrupt preformed complexes (e.g., compounds with higher binding constants that displace one of the components from the complex) can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are described briefly below.

[0345] In a heterogeneous assay system, either the gene product ofinterest (e.g., HKNG1, GNKH or TS) or the interactive binding partner,is anchored onto a solid surface, while the non-anchored species islabeled, either directly or indirectly. In practice, microtiter platesare conveniently utilized. The anchored species may be immobilized bynon-covalent or covalent attachments. Non-covalent attachment may beaccomplished simply by coating the solid surface with a solution of theHKNG1, GNKH or TS gene product or binding partner and drying.Alternatively, an immobilized antibody specific for the species to beanchored may be used to anchor the species to the solid surface. Thesurfaces may be prepared in advance and stored.

[0346] In order to conduct the assay, the partner of the immobilizedspecies is exposed to the coated surface with or without the testcompound. After the reaction is complete, unreacted components areremoved (e.g., by washing) and any complexes formed will remainimmobilized on the solid surface. The detection of complexes anchored onthe solid surface can be accomplished in a number of ways. Where thenon-immobilized species is pre-labeled, the detection of labelimmobilized on the surface indicates that complexes were formed. Wherethe non-immobilized species is not pre-labeled, an indirect label can beused to detect complexes anchored on the surface; e.g., using a labeledantibody specific for the initially non-immobilized species (theantibody, in turn, may be directly labeled or indirectly labeled with alabeled anti-Ig antibody). Depending upon the order of addition ofreaction components, test compounds that inhibit complex formation orthat disrupt preformed complexes can be detected.

[0347] Alternatively, the reaction can be conducted in a liquid phase inthe presence or absence of the test compound, the reaction productsseparated from unreacted components, and complexes detected; e.g., usingan immobilized antibody specific for one of the binding components toanchor any complexes formed in solution, and a labeled antibody specificfor the other partner to detect anchored complexes. Again, dependingupon the order of addition of reactants to the liquid phase, testcompounds that inhibit complex formation or that disrupt preformedcomplexes can be identified.

[0348] In an alternate embodiment of the invention, a homogeneous assay can be used. In this approach, a preformed complex of the gene product of interest (e.g., HKNG1, GNKH or TS) and the interactive binding partner is prepared in which either the gene product or its binding partner is labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 by Rubenstein which utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt interactions between a gene product of the invention (e.g., HKNG1, GNKH or TS) and its binding partner or partners can be identified.
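As a non-limiting sketch, a decision as to whether a signal is "above background" in such a homogeneous assay may be expressed as shown below. The function name `displaces_partner` and the cutoff of the background mean plus three sample standard deviations are assumptions chosen for illustration; no particular cutoff is specified herein.

```python
from statistics import mean, stdev


def displaces_partner(test_signal, background_signals, k=3.0):
    """Flag a test substance whose signal exceeds the background cutoff.

    background_signals: replicate readings of the quenched, preformed
                        complex without any test substance.
    k:                  hypothetical number of standard deviations above
                        the background mean used as the cutoff.
    """
    if len(background_signals) < 2:
        raise ValueError("need at least two background readings")
    cutoff = mean(background_signals) + k * stdev(background_signals)
    return test_signal > cutoff


if __name__ == "__main__":
    background = [10.0, 12.0, 11.0, 9.0]
    print(displaces_partner(30.0, background))  # well above background
    print(displaces_partner(11.0, background))  # within background
```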

[0349] In another embodiment of the invention, these same techniques canbe employed using peptide fragments that correspond to the bindingdomains of the gene product of interest (e.g., HKNG1, GNKH or TS) and/orthe binding partner (in cases where the binding partner is a protein),in place of one or both of the full length proteins. Any number ofmethods routinely practiced in the art can be used to identify andisolate the binding sites. These methods include, but are not limitedto, mutagenesis of the gene encoding one of the proteins and screeningfor disruption of binding in a co-immunoprecipitation assay.Compensating mutations in the gene encoding the second species in thecomplex can then be selected. Sequence analysis of the genes encodingthe respective proteins will reveal the mutations that correspond to theregion of the protein involved in interactive binding. Alternatively,one protein can be anchored to a solid surface using methods describedin this Section above, and allowed to interact with and bind to itslabeled binding partner, which has been treated with a proteolyticenzyme, such as trypsin. After washing, a short, labeled peptidecomprising the binding domain may remain associated with the solidmaterial, which can be isolated and identified by amino acid sequencing.Also, once the gene coding for the segments is engineered to expresspeptide fragments of the protein, it can then be tested for bindingactivity and purified or synthesized.

[0350] For example, and not by way of limitation, a HKNG1, GNKH or TSgene product can be anchored to a solid material as described, above, inthis Section by: (a) making a GST-HKNG1 fusion protein, in the case ofan HKNG1 gene product, a GST-GNKH fusion protein, in the case of a GNKHgene product, or a GST-TS fusion protein, in the case of a TS geneproduct and (b) allowing it to bind to glutathione agarose beads. Thebinding partner can be labeled with a radioactive isotope, such as ³⁵S,and cleaved with a proteolytic enzyme such as trypsin. Cleavage productscan then be added to the anchored fusion protein and allowed to bind.After washing away unbound peptides, labeled bound material,representing the binding partner binding domain, can be eluted,purified, and analyzed for amino acid sequence by well-known methods.Peptides so identified can be produced synthetically or produced usingrecombinant DNA technology.

5.6.4. IDENTIFICATION OF COMPOUNDS THAT AMELIORATE A HKNG1-, A GNKH- OR A TS-MEDIATED DISORDER

[0351] Compounds, including but not limited to binding compounds identified, e.g., via the assay techniques described hereinabove in Sections 5.6.1-5.6.3, can also be tested for the ability to ameliorate symptoms of a disorder that is associated with and/or mediated by a gene product of the invention including, for example, a disorder associated with and/or mediated by a HKNG1, GNKH or TS gene product. In particular, as demonstrated in the Examples presented herein below, the HKNG1, GNKH and TS genes of the present invention are located in a region of human chromosome 18p which is associated with central nervous system (CNS) disorders such as neuropsychiatric disorders including, for example, bipolar affective (mood) disorders (e.g., severe bipolar affective disorder or BP-I and bipolar affective disorder with hypomania and major depression or BP-II) and schizophrenia. Thus, compounds identified, e.g., via the above-described screening assays can be tested for the ability to ameliorate such disorders.

[0352] It is also noted that the assays described herein can alsoidentify compounds that affect HKNG1, GNKH or TS activity, e.g., byaffecting HKNG1, GNKH or TS gene expression, or by affecting the levelof HKNG1, GNKH or TS gene product activity. For example, compounds canbe identified that are involved in another step in the pathway in whichthe HKNG1 gene and/or HKNG1 gene product is involved and, by affectingthis same pathway, can modulate the effect of HKNG1 on the developmentof a HKNG1-mediated disorder. Likewise, compounds can also be identifiedthat are involved in another step in the pathway in which the GNKH geneand/or GNKH gene product is involved and, by affecting this samepathway, can modulate the effect of GNKH on the development of aGNKH-mediated disorder. Likewise, compounds can also be identified thatare involved in another step in the pathway in which the TS gene and/orTS gene product is involved and, by affecting this same pathway, canmodulate the effect of TS on the development of a TS-mediated disorder.Such compounds can therefore be used, e.g., as part of a therapeuticmethod for the treatment of the disorder, as described in Section 5.7,below.

[0353] Described hereinbelow are cell-based and animal model-basedassays for the identification of compounds exhibiting such an ability toameliorate symptoms of a disorder, such as a neuropsychiatric disorder(e.g., BAD or schizophrenia), that is associated with and/or mediated bya gene product of the invention (e.g., HKNG1, GNKH or TS).

[0354] First, cell-based systems can be used to identify compounds that may act to ameliorate symptoms of such a disorder. Such cell systems can include, for example, recombinant or non-recombinant cells, such as cell lines, that express the HKNG1 gene, recombinant or non-recombinant cells or cell lines that express the GNKH gene, or, alternatively, recombinant or non-recombinant cells or cell lines that express the TS gene. In utilizing such cell systems, cells that express HKNG1, GNKH or TS can be exposed to a compound suspected of exhibiting an ability to ameliorate symptoms of a disorder, such as a neuropsychiatric disorder (e.g., BAD or schizophrenia), that is mediated by or associated with HKNG1, GNKH or TS. Preferably, the cells are exposed to the compound at a sufficient concentration and for a sufficient time to elicit such an amelioration of such symptoms in the exposed cells. After exposure, the cells can be assayed to measure alterations in the expression of the HKNG1, GNKH or TS gene, e.g., by assaying cell lysates for HKNG1, GNKH or TS mRNA transcripts (e.g., by Northern analysis) or for HKNG1, GNKH or TS gene products expressed by the cells. Compounds that modulate expression of the HKNG1, GNKH or TS gene are good candidates as therapeutics, e.g., in the therapeutic methods described in Section 5.7, below.

[0355] Animal-based systems or models of a disorder, such as aneuropsychiatric disorder (e.g., BAD or schizophrenia) associated withor mediated by a gene or gene product of the invention (e.g., HKNG1,GNKH or TS) can also be used to identify compounds capable ofameliorating symptoms of the disorder. Such animal-based systems andmodels include, for example, transgenic animals, such as the transgenicanimals described in Section 5.1, above (e.g., transgenic mice),containing a human or altered form of a HKNG1, GNKH or TS gene.

[0356] Such animal-based systems and models can be used, e.g., as test substrates for the identification of drugs, pharmaceuticals, therapies and interventions. For example, animal models can be exposed to a compound suspected of exhibiting an ability to ameliorate symptoms of a disorder, such as a neuropsychiatric disorder (e.g., BAD or schizophrenia) associated with or mediated by HKNG1, GNKH or TS. Preferably, the animal models are exposed to the compound at a sufficient concentration and for a sufficient time to elicit such an amelioration of symptoms of the disorder. The response of the animals to the exposure can be monitored, e.g., by assessing the reversal of symptoms of the disorder.

[0357] As the skilled artisan will readily appreciate, any compound or treatment that reverses any aspect of symptoms of a disorder, such as a neuropsychiatric disorder (e.g., BAD or schizophrenia), is considered a candidate for human therapeutic intervention in such disorders. Dosages of test agents, e.g., for human clinical trials, can be determined, as discussed below, in Section 5.8.1, by deriving appropriate dose-response curves.

5.7. METHODS FOR DIAGNOSIS AND PROGNOSTICATION OF HKNG1-, GNKH- AND TS-RELATED DISORDERS

[0358] The methods described herein can furthermore be utilized asdiagnostic or prognostic assays to identify subjects having or at riskof developing a disease or disorder associated with aberrant expressionor activity of a polypeptide of the invention. For example, the assaysdescribed herein, such as the preceding diagnostic assays or thefollowing assays, can be utilized to identify a subject having or atrisk of developing a disorder associated with aberrant expression oractivity of a polypeptide of the invention. Alternatively, theprognostic assays can be utilized to identify a subject having or atrisk for developing such a disease or disorder. Thus, the presentinvention provides a method in which a test sample is obtained from asubject and a polypeptide or nucleic acid (e.g., mRNA, genomic DNA) ofthe invention is detected, wherein the presence of the polypeptide ornucleic acid is diagnostic for a subject having or at risk of developinga disease or disorder associated with aberrant expression or activity ofthe polypeptide. As used herein, a “test sample” refers to a biologicalsample obtained from a subject of interest. For example, a test samplecan be a biological fluid (e.g., serum), cell sample, or tissue.

[0359] Furthermore, the prognostic assays described herein can be usedto determine whether a subject can be administered an agent (e.g., anagonist, antagonist, peptidomimetic, protein, peptide, nucleic acid,small molecule, or other drug candidate) to treat a disease or disorderassociated with aberrant expression or activity of a polypeptide of theinvention. For example, such methods can be used to determine whether asubject can be effectively treated with a specific agent or class ofagents (e.g., agents of a type which decrease activity of thepolypeptide). Thus, the present invention provides methods fordetermining whether a subject can be effectively treated with an agentfor a disorder associated with aberrant expression or activity of apolypeptide of the invention in which a test sample is obtained and thepolypeptide or nucleic acid encoding the polypeptide is detected (e.g.,wherein the presence of the polypeptide or nucleic acid is diagnosticfor a subject that can be administered the agent to treat a disorderassociated with aberrant expression or activity of the polypeptide).

[0360] The methods of the invention can also be used to detect genetic lesions or mutations in a gene of the invention, thereby determining if a subject with the lesioned gene is at risk for a disorder characterized by aberrant expression or activity of a polypeptide of the invention. In preferred embodiments, the methods include detecting, in a sample of cells from the subject, the presence or absence of a genetic lesion or mutation characterized by at least one of an alteration affecting the integrity of a gene encoding the polypeptide of the invention, or the mis-expression of the gene encoding the polypeptide of the invention. For example, such genetic lesions or mutations can be detected by ascertaining the existence of at least one of: 1) a deletion of one or more nucleotides from the gene; 2) an addition of one or more nucleotides to the gene; 3) a substitution of one or more nucleotides of the gene; 4) a chromosomal rearrangement of the gene; 5) an alteration in the level of a messenger RNA transcript of the gene; 6) an aberrant modification of the gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; 8) a non-wild type level of the protein encoded by the gene; 9) an allelic loss of the gene; and 10) an inappropriate post-translational modification of the protein encoded by the gene. As described herein, there are a large number of assay techniques known in the art which can be used for detecting lesions in a gene.

[0361] In certain embodiments, detection of the lesion involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) Proc. Natl. Acad. Sci. USA 91:360-364), the latter of which can be particularly useful for detecting point mutations in a gene (see, e.g., Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to the selected gene under conditions such that hybridization and amplification of the gene (if present) occur, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

[0362] Alternative amplification methods include: self sustainedsequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA87:1874-1878), transcriptional amplification system (Kwoh, et al. (1989)Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi etal. (1988) Bio/Technology 6:1197), or any other nucleic acidamplification method, followed by the detection of the amplifiedmolecules using techniques well known to those of skill in the art.These detection schemes are especially useful for the detection ofnucleic acid molecules if such molecules are present in very lownumbers.

[0363] In an alternative embodiment, mutations in a selected gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined by gel electrophoresis and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, e.g., U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
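The comparison of restriction fragment lengths between sample and control DNA can be illustrated, without limitation, by the following minimal in silico sketch. It assumes linear DNA, a single enzyme, and short hypothetical sequences; the function names `digest` and `patterns_differ` are introduced here for illustration only. The default recognition site is that of EcoRI, which cuts between G and A in GAATTC.

```python
def digest(seq, site, cut_offset):
    """Return sorted fragment lengths after cutting linear DNA at each site.

    cut_offset is the number of bases into the recognition site at which
    the strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1
    # Fragment lengths are the distances between consecutive cut positions.
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


def patterns_differ(sample, control, site="GAATTC", cut_offset=1):
    """True when sample and control give different fragment length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)


if __name__ == "__main__":
    control = "AAAAGAATTCTTTT"   # hypothetical sequence with one EcoRI site
    sample = "AAAAGAGTTCTTTT"    # hypothetical point mutation destroys the site
    print(digest(control, "GAATTC", 1))
    print(digest(sample, "GAATTC", 1))
    print(patterns_differ(sample, control))
```

A difference in pattern in this sketch corresponds to the loss or gain of a recognition site; mutations that do not alter a recognition site or a fragment length would not be detected by such a comparison.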

[0364] In other embodiments, genetic mutations can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al., 1996, Human Mutation 7:244-255; Kozal et al., 1996, Nature Medicine 2:753-759). For example, genetic mutations can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin et al., supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

[0365] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the selected gene and detect mutations by comparing the sequence of the sample nucleic acids with the corresponding wild-type (control) sequence. Examples of sequencing reactions include those based on techniques developed by Maxam and Gilbert, 1977, Proc. Natl. Acad. Sci. USA 74:560 or Sanger, 1977, Proc. Natl. Acad. Sci. USA 74:5463. It is also contemplated that any of a variety of automated sequencing procedures can be utilized when performing the diagnostic assays (1995, Bio/Techniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT Publication No. WO 94/16101; Cohen et al., 1996, Adv. Chromatogr. 36:127-162; and Griffin et al., 1993, Appl. Biochem. Biotechnol. 38:147-159).
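For purposes of illustration, the comparison of a sample sequence with the corresponding wild-type (control) sequence may be sketched as follows. The sketch assumes that the two sequences are already aligned and of equal length, so that only substitutions are reported; the function name `find_substitutions` and the example sequences are hypothetical.

```python
def find_substitutions(wild_type, sample):
    """List (position, wild_type_base, sample_base) for each mismatch.

    Positions are 1-based. Insertions and deletions are outside the scope
    of this sketch, so the sequences must have the same length.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    pairs = zip(wild_type.upper(), sample.upper())
    return [(i + 1, w, s) for i, (w, s) in enumerate(pairs) if w != s]


if __name__ == "__main__":
    # Hypothetical six-base sequences differing at the fourth position.
    print(find_substitutions("ATGCCT", "ATGACT"))
```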

[0366] Other methods for detecting mutations in a selected gene includemethods in which protection from cleavage agents is used to detectmismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al.,1985, Science 230:1242). In general, the technique of mismatch cleavageentails providing heteroduplexes formed by hybridizing (labeled) RNA orDNA containing the wild-type sequence with potentially mutant RNA or DNAobtained from a tissue sample. The double-stranded duplexes are treatedwith an agent which cleaves single-stranded regions of the duplex suchas which will exist due to basepair mismatches between the control andsample strands. RNA/DNA duplexes can be treated with RNase to digestmismatched regions, and DNA/DNA hybrids can be treated with S1 nucleaseto digest mismatched regions. In other embodiments, either DNA/DNA orRNA/DNA duplexes can be treated with hydroxylamine or osmium tetroxideand with piperidine in order to digest mismatched regions. Afterdigestion of the mismatched regions, the resulting material is thenseparated by size on denaturing polyacrylamide gels to determine thesite of mutation. (See, e.g., Cotton et al., 1988, Proc. Natl. Acad.Sci. USA 85:4397; Saleeba et al., 1992, Methods Enzymol. 217:286-295.)In a preferred embodiment, the control DNA or RNA can be labeled fordetection.

[0367] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair enzymes”) in defined systems for detecting and mapping point mutations in cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al., 1994, Carcinogenesis 15:1657-1662). According to an exemplary embodiment, a probe based on a selected sequence, e.g., a wild-type sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be detected from electrophoresis protocols or the like. (See, e.g., U.S. Pat. No. 5,459,039.)

[0368] In other embodiments, alterations in electrophoretic mobilitywill be used to identify mutations in genes. For example, single strandconformation polymorphism (SSCP) may be used to detect differences inelectrophoretic mobility between mutant and wild type nucleic acids(Orita et al., 1989, Proc. Natl. Acad. Sci. USA 86:2766; see alsoCotton, 1993, Mutat. Res. 285:125-144; Hayashi, 1992, Genet. Anal. Tech.Appl. 9:73-79). Single-stranded DNA fragments of sample and controlnucleic acids will be denatured and allowed to renature. The secondarystructure of single-stranded nucleic acids varies according to sequence,and the resulting alteration in electrophoretic mobility enables thedetection of even a single base change. The DNA fragments may be labeledor detected with labeled probes. The sensitivity of the assay may beenhanced by using RNA (rather than DNA), in which the secondarystructure is more sensitive to a change in sequence. In a preferredembodiment, the subject method utilizes heteroduplex analysis toseparate double stranded heteroduplex molecules on the basis of changesin electrophoretic mobility (Keen et al., 1991, Trends Genet. 7:5).

[0369] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al., 1985, Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner, 1987, Biophys. Chem. 265:12753).

[0370] Examples of other techniques for detecting point mutationsinclude, but are not limited to, selective oligonucleotidehybridization, selective amplification, or selective primer extension.For example, oligonucleotide primers may be prepared in which the knownmutation is placed centrally and then hybridized to target DNA underconditions which permit hybridization only if a perfect match is found(Saiki et al., 1986, Nature 324:163; Saiki et al., 1989, Proc. Natl.Acad. Sci. USA 86:6230). Such allele specific oligonucleotides arehybridized to PCR amplified target DNA or a number of differentmutations when the oligonucleotides are attached to the hybridizingmembrane and hybridized with labeled target DNA.

[0371] Alternatively, allele specific amplification technology whichdepends on selective PCR amplification may be used in conjunction withthe instant invention. Oligonucleotides used as primers for specificamplification may carry the mutation of interest in the center of themolecule (so that amplification depends on differential hybridization)(Gibbs et al., 1989, Nucleic Acids Res. 17:2437-2448) or at the extreme3′ end of one primer where, under appropriate conditions, mismatch canprevent or reduce polymerase extension (Prossner, 1993, Tibtech 11:238).In addition, it may be desirable to introduce a novel restriction sitein the region of the mutation to create cleavage-based detection(Gasparini et al., 1992, Mol. Cell Probes 6:1). It is anticipated thatin certain embodiments amplification may also be performed using Taqligase for amplification (Barany, 1991, Proc. Natl. Acad. Sci. USA88:189). In such cases, ligation will occur only if there is a perfectmatch at the 3′ end of the 5′ sequence making it possible to detect thepresence of a known mutation at a specific site by looking for thepresence or absence of amplification.
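The principle that a mismatch at the extreme 3′ end of a primer can prevent or reduce polymerase extension may be illustrated, in a deliberately simplified manner, by the sketch below. It treats extension as all-or-none and compares each primer with the target written 5′ to 3′ in the same sense as the primer; the function names and the five-base example sequences are hypothetical and serve only to show the logic.

```python
def three_prime_matches(primer, target):
    """True when the primer's 3'-terminal base equals the aligned target base.

    Both strings are written 5'->3' in the same sense and must have the
    same, non-zero length.
    """
    if not primer or len(primer) != len(target):
        raise ValueError("primer and target must be non-empty and equal length")
    return primer[-1].upper() == target[-1].upper()


def call_allele(target, wild_type_primer, mutant_primer):
    """Call the allele from which primer is predicted to be extended."""
    wt = three_prime_matches(wild_type_primer, target)
    mut = three_prime_matches(mutant_primer, target)
    if wt and not mut:
        return "wild-type"
    if mut and not wt:
        return "mutant"
    return "undetermined"


if __name__ == "__main__":
    wt_primer, mut_primer = "ACGTG", "ACGTA"  # differ only at the 3' end
    print(call_allele("ACGTG", wt_primer, mut_primer))
    print(call_allele("ACGTA", wt_primer, mut_primer))
```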

[0372] The methods described herein may be performed, for example, byutilizing pre-packaged diagnostic kits comprising at least one probenucleic acid or antibody reagent described herein, which may beconveniently used, e.g., in clinical settings to diagnose patientsexhibiting symptoms or family history of a disease or illness involvinga gene encoding a polypeptide of the invention. Furthermore, any celltype or tissue, preferably peripheral blood leukocytes, in which thepolypeptide of the invention is expressed may be utilized in theprognostic assays described herein.

5.8. COMPOSITIONS AND METHODS FOR THE TREATMENT OF HKNG1-, GNKH- AND TS-MEDIATED DISORDERS

[0373] This section describes methods and compositions whereby a disorder, which is associated with and/or mediated by a gene or gene product of the present invention, can be treated. In particular, as demonstrated in the Examples presented herein below, the HKNG1, GNKH and TS genes of the present invention are located in a region of human chromosome 18p which is associated with central nervous system (CNS) disorders such as neuropsychiatric disorders including, for example, bipolar affective (mood) disorders (e.g., severe bipolar affective disorder or BP-I and bipolar affective disorder with hypomania and major depression or BP-II) and schizophrenia. Thus, the methods and compositions described herein can be used, e.g., to treat CNS disorders including neuropsychiatric disorders such as bipolar affective (mood) disorders (e.g., severe bipolar affective disorder or BP-I and bipolar affective disorder with hypomania and major depression or BP-II) and schizophrenia.

[0374] Such methods can comprise, for example, administering one or more compounds that modulate the expression of a gene of the present invention (e.g., a HKNG1, GNKH or TS gene, particularly a mammalian HKNG1, GNKH or TS gene). The methods can also comprise, e.g., administering compounds that modulate the synthesis or activity of a gene product of the invention (e.g., a HKNG1, GNKH or TS gene product, particularly a mammalian HKNG1, GNKH or TS gene product) so that symptoms of the disorder are ameliorated. In other embodiments, the methods of treatment comprise treatment of a disorder, such as a neuropsychiatric disorder, resulting from a mutation of a HKNG1, GNKH or TS gene. In such embodiments, methods of treatment can comprise supplying the subject with a cell comprising a nucleic acid molecule that encodes an unimpaired HKNG1, GNKH or TS gene product such that the cell expresses the unimpaired HKNG1, GNKH or TS gene product and symptoms of the disorder are ameliorated.

[0375] In certain embodiments, wherein a loss of normal function of aHKNG1 gene product results in the development of a disorder, an increasein HKNG1 gene product activity can facilitate progress towards anasymptomatic state in individuals exhibiting a deficient level of HKNG1gene expression or gene product activity. Likewise, in embodimentswherein a loss of normal function of a GNKH gene product results in thedevelopment of a disorder, an increase in GNKH gene product activity canfacilitate progress towards an asymptomatic state in individualsexhibiting a deficient level of GNKH gene expression or gene productactivity. Likewise, in embodiments wherein a loss of normal function ofa TS gene product results in the development of a disorder, an increasein TS gene product activity can facilitate progress towards anasymptomatic state in individuals exhibiting a deficient level of TSgene expression or gene product activity.

[0376] Alternatively, in certain embodiments, symptoms of a disorder such as a neuropsychiatric disorder may be ameliorated by administering a compound that decreases the level of HKNG1 gene expression and/or HKNG1 gene product activity. Likewise, symptoms of a disorder, such as a neuropsychiatric disorder, may be ameliorated by administering a compound that decreases the level of GNKH gene expression and/or GNKH gene product activity. Likewise, symptoms of a disorder, such as a neuropsychiatric disorder, may be ameliorated by administering a compound that decreases the level of TS gene expression and/or TS gene product activity.

[0377] Such compounds, including compounds identified, e.g., via the techniques described, above, in Section 5.8, that are capable of modulating HKNG1, GNKH or TS gene product activity, can be administered using standard techniques that are well known to those of skill in the art. In certain embodiments, the compounds to be administered are to involve an interaction with brain cells. In such instances, the administration techniques preferably include well known ones that allow for a crossing of the blood-brain barrier.

[0378] In one embodiment of the treatment methods of the invention, the compounds administered comprise compounds, in particular drugs, which ameliorate the symptoms of a disorder described herein as a neuropsychiatric disorder (e.g., BAD or schizophrenia). Such compounds include, e.g., drugs within the families of antidepressants such as lithium salts, carbamazepine, valproic acid, lysergic acid diethylamide (LSD), p-chlorophenylalanine, p-propyldopacetamide dithiocarbamate derivatives e.g., FLA 63; anti-anxiety drugs, e.g., diazepam; monoamine oxidase (MAO) inhibitors, e.g., iproniazid, clorgyline, phenelzine and isocarboxazid; biogenic amine uptake blockers, e.g., tricyclic antidepressants such as desipramine, imipramine and amitriptyline; serotonin reuptake inhibitors e.g., fluoxetine; antipsychotic drugs such as phenothiazine derivatives (e.g., chlorpromazine (Thorazine) and trifluopromazine), butyrophenones (e.g., haloperidol (Haldol)), thioxanthene derivatives (e.g., chlorprothixene), and dibenzodiazepines (e.g., clozapine); benzodiazepines; dopaminergic agonists and antagonists e.g., L-DOPA, cocaine, amphetamine, α-methyl-tyrosine, reserpine, tetrabenazine, benztropine, pargyline; noradrenergic agonists and antagonists e.g., clonidine, phenoxybenzamine, phentolamine, tropolone.

[0379] In another embodiment, symptoms of a disorder described herein, e.g., a neuropsychiatric disorder such as BAD or schizophrenia, may be ameliorated by protein therapy methods, e.g., decreasing or increasing the level and/or activity of a protein of the present invention (e.g., HKNG1, GNKH or TS) using, e.g., a HKNG1, GNKH or TS protein, a fusion HKNG1, GNKH or TS protein, or HKNG1, GNKH or TS peptide sequences described in Section 5.2, above; or by the administration of proteins or protein fragments (e.g., peptides) which interact with a HKNG1, GNKH or TS gene or gene product and thereby inhibit or potentiate its activity.

[0380] Such protein therapy may include, for example, the administration of a functional HKNG1 or GNKH protein, or fragments of an HKNG1, GNKH or TS protein (e.g., peptides) which represent functional domains of HKNG1, GNKH or TS.

[0381] In one embodiment, protein fragments or peptides representing a functional binding domain of a HKNG1, GNKH or TS protein are administered to an individual such that the protein fragments or peptides bind to a HKNG1, GNKH or TS binding protein, e.g., a HKNG1, GNKH or TS receptor. Such fragments or peptides may serve, e.g., to inhibit HKNG1, GNKH or TS activity in an individual by competing with, and thereby inhibiting, binding of HKNG1, GNKH or TS to the binding protein, thereby ameliorating symptoms of a disorder described herein. Alternatively, such fragments or peptides may enhance HKNG1, GNKH or TS activity in an individual by mimicking the function of HKNG1, GNKH or TS in vivo, thereby ameliorating the symptoms of a disorder described herein.

[0382] The proteins and peptides which may be used in the methods of the invention include synthetic (e.g., recombinant or chemically synthesized) proteins and peptides, as well as naturally occurring proteins and peptides. The proteins and peptides may have both naturally occurring and non-naturally occurring amino acid residues (e.g., D-amino acid residues) and/or one or more non-peptide bonds (e.g., imino, ester, hydrazide, semicarbazide, and azo bonds). The proteins or peptides may also contain additional chemical groups (i.e., functional groups) present at the amino and/or carboxy termini, such that, for example, the stability, bioavailability, and/or inhibitory activity of the peptide is enhanced. Exemplary functional groups include hydrophobic groups (e.g., carbobenzoxyl, dansyl, and t-butyloxycarbonyl groups), an acetyl group, a 9-fluorenylmethoxy-carbonyl group, and macromolecular carrier groups (e.g., lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates) including peptide groups.

5.8.1. INHIBITORY APPROACHES

[0383] In certain embodiments of the invention, symptoms of a disorder mediated, e.g., by HKNG1, GNKH or TS (e.g., neuropsychiatric disorders such as BAD and schizophrenia) can be ameliorated by decreasing the level of HKNG1, GNKH or TS gene expression and/or HKNG1, GNKH or TS gene product activity using gene sequences (i.e., HKNG1 and/or GNKH gene sequences) in conjunction with well-known antisense, gene “knock-out,” ribozyme and/or triple helix methods to decrease the level of HKNG1, GNKH or TS gene expression. Among the compounds that may exhibit the ability to modulate the activity, expression or synthesis of a HKNG1, GNKH or TS gene (including the ability to ameliorate symptoms of a disorder mediated by a HKNG1, GNKH or TS gene, including a neuropsychiatric disorder, such as BAD or schizophrenia) are antisense, ribozyme, and triple helix molecules. Such molecules can be designed to reduce or inhibit either unimpaired or, if appropriate, mutant target gene activity (i.e., HKNG1, GNKH or TS activity). Techniques for the production and use of such molecules are well known to those of skill in the art.

[0384] Antisense RNA and DNA molecules act to directly block the translation of mRNA by hybridizing to targeted mRNA and preventing protein translation. Antisense approaches involve the design of oligonucleotides that are complementary to a target gene mRNA. The antisense oligonucleotides will bind to the complementary target gene mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required.

[0385] A sequence “complementary” to a portion of an RNA, as referred to herein, means a sequence having sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex.

[0386] In one embodiment, oligonucleotides complementary to non-coding regions of a HKNG1, GNKH or TS gene could be used in an antisense approach to inhibit translation of endogenous HKNG1, GNKH or TS mRNA. Antisense nucleic acids should be at least six nucleotides in length, and are preferably oligonucleotides ranging from 6 to about 50 nucleotides in length. In specific aspects the oligonucleotide is at least 10 nucleotides, at least 17 nucleotides, at least 25 nucleotides or at least 50 nucleotides.
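By way of illustration only, and not limitation, the relationship between a target mRNA segment and a fully complementary antisense oligoribonucleotide can be sketched in a few lines of Python. The function names and the placeholder sequence are assumptions introduced for this example; the placeholder is not a HKNG1, GNKH or TS sequence, and the length bounds simply restate the 6 to about 50 nucleotide range given above.

```python
# Minimal sketch: derive a fully complementary antisense RNA for a target segment.
# The sequence used below is a made-up placeholder, NOT a HKNG1, GNKH or TS sequence.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA segment (written 5'->3')."""
    seq = target_mrna.upper().replace("T", "U")  # accept DNA-style input as well
    # Reverse the target, then complement each base, so both strands read 5'->3'.
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def length_is_acceptable(oligo: str, minimum: int = 6, maximum: int = 50) -> bool:
    """Check the oligonucleotide against the length range stated in the text."""
    return minimum <= len(oligo) <= maximum


placeholder_target = "AUGGCUAGCUUAGGCAUC"  # hypothetical 18-nucleotide target segment
oligo = antisense_rna(placeholder_target)
print(oligo, length_is_acceptable(oligo))
```

Because absolute complementarity is preferred but not required, a sketch of this kind only yields a starting candidate; whether a given oligonucleotide hybridizes adequately remains an experimental question.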

[0387] Regardless of the choice of target sequence, it is preferred that in vitro studies are first performed to quantitate the ability of the antisense oligonucleotide to inhibit gene expression. It is preferred that these studies utilize controls that distinguish between antisense gene inhibition and nonspecific biological effects of oligonucleotides. It is also preferred that these studies compare levels of the target RNA or protein with that of an internal control RNA or protein. Additionally, it is envisioned that results obtained using the antisense oligonucleotide are compared with those obtained using a control oligonucleotide. It is preferred that the control oligonucleotide is of approximately the same length as the test oligonucleotide and that the nucleotide sequence of the oligonucleotide differs from the antisense sequence no more than is necessary to prevent specific hybridization to the target sequence.
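As a hedged illustration of a control oligonucleotide of the same length that differs only at a few positions, the following Python sketch introduces a small number of evenly spaced base substitutions into an antisense RNA sequence. The function name, the default of four mismatches and the even spacing are assumptions made for this example; how many mismatches actually suffice to prevent specific hybridization would need to be determined empirically.

```python
# Minimal sketch: build a same-length mismatch control from an antisense RNA sequence.
# Assumes an RNA alphabet (A, C, G, U); the number of mismatches is an arbitrary choice.
SWAP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_control(antisense: str, n_mismatches: int = 4) -> str:
    """Return a copy of the oligonucleotide with n evenly spaced substitutions."""
    seq = list(antisense.upper())
    if not 0 < n_mismatches <= len(seq):
        raise ValueError("n_mismatches must be between 1 and the oligo length")
    step = len(seq) / n_mismatches
    for i in range(n_mismatches):
        pos = int(i * step + step / 2)  # centre of the i-th equal-width window
        seq[pos] = SWAP[seq[pos]]       # replace the base so it no longer pairs
    return "".join(seq)


control = mismatch_control("GAUGCCUAAGCUAGCCAU")  # placeholder antisense sequence
print(control)
```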

[0388] The oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. The oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger, et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre, et al., 1987, Proc. Natl. Acad. Sci. U.S.A. 84:648-652; PCT Publication No. WO88/09810, published Dec. 15, 1988) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134, published Apr. 25, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

[0389] The antisense oligonucleotide may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.

[0390] The antisense oligonucleotide may also comprise at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose.

[0391] In yet another embodiment, the antisense oligonucleotide comprises at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

[0392] In yet another embodiment, the antisense oligonucleotide is an α-anomeric oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier, et al., 1987, Nucl. Acids Res. 15:6625-6641). The oligonucleotide is a 2′-O-methylribonucleotide (Inoue, et al., 1987, Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue, et al., 1987, FEBS Lett. 215:327-330).

[0393] Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligonucleotides may be synthesized by the method of Stein, et al. (1988, Nucl. Acids Res. 16:3209), methylphosphonate oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin, et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451), etc.

[0394] While antisense nucleotides complementary to the target gene coding region sequence could be used, those complementary to the transcribed, untranslated region are most preferred.

[0395] Antisense molecules should be delivered to cells that express the target gene in vivo. A number of methods have been developed for delivering antisense DNA or RNA to cells; e.g., antisense molecules can be injected directly into the tissue site, or modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies that specifically bind receptors or antigens expressed on the target cell surface) can be administered systemically.

[0396] A preferred approach to achieve intracellular concentrations of the antisense sufficient to suppress translation of endogenous mRNAs utilizes a recombinant DNA construct in which the antisense oligonucleotide is placed under the control of a strong pol III or pol II promoter. The use of such a construct to transfect target cells in the patient will result in the transcription of sufficient amounts of single stranded RNAs that will form complementary base pairs with the endogenous target gene transcripts and thereby prevent translation of the target gene mRNA. For example, a vector can be introduced such that it is taken up by a cell and directs the transcription of an antisense RNA. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art. Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells. Expression of the sequence encoding the antisense RNA can be by any promoter known in the art to act in mammalian, preferably human, cells. Such promoters can be inducible or constitutive. Such promoters include but are not limited to: the SV40 early promoter region (Bernoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3′ long terminal repeat of Rous sarcoma virus (Yamamoto, et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner, et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster, et al., 1982, Nature 296:39-42), etc. Any type of plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct which can be introduced directly into the tissue site. Alternatively, viral vectors can be used that selectively infect the desired tissue, in which case administration may be accomplished by another route (e.g., systemically).

[0397] Ribozyme molecules designed to catalytically cleave target gene mRNA transcripts can also be used to prevent translation of target gene mRNA and, therefore, expression of target gene product. (See, e.g., PCT International Publication WO90/11364, published Oct. 4, 1990; Sarver, et al., 1990, Science 247:1222-1225).

[0398] Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA. (For a review, see Rossi, 1994, Current Biology 4:469-471). The mechanism of ribozyme action involves sequence specific hybridization of the ribozyme molecule to complementary target RNA, followed by an endonucleolytic cleavage event. The composition of ribozyme molecules must include one or more sequences complementary to the target gene mRNA, and must include the well known catalytic sequence responsible for mRNA cleavage. For this sequence, see, e.g., U.S. Pat. No. 5,093,246, which is incorporated herein by reference in its entirety.

[0399] While ribozymes that cleave mRNA at site specific recognition sequences can be used to destroy target gene mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target mRNA have the following sequence of two bases: 5′-GU-3′. Preferably, the target mRNA has one of the following sequences of three bases: 5′-GUA-3′, 5′-GUC-3′ or 5′-GUU-3′. The construction and production of hammerhead ribozymes is well known in the art and is described more fully, e.g., in Ruffner et al., 1990, Biochemistry 29:10695-10702; in Myers, 1995, Molecular Biology and Biotechnology: A Comprehensive Desk Reference, VCH Publishers, New York, (see especially FIG. 4, page 833); and in Haseloff and Gerlach, 1988, Nature, 334:585-591, each of which is incorporated herein by reference in its entirety.
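For illustration only, candidate hammerhead cleavage sites of the kind described above can be located by scanning an mRNA sequence for the 5′-GU-3′ dinucleotide or for the preferred GUA, GUC and GUU triplets. The following Python sketch uses the standard `re` module; the function name and the example sequence are assumptions made for this example and do not represent any sequence of the invention.

```python
import re


def hammerhead_sites(mrna: str, preferred_only: bool = True):
    """List (0-based position, motif) pairs for candidate hammerhead target sites.

    With preferred_only=True only the GUA, GUC and GUU triplets are reported;
    otherwise every GU dinucleotide is reported.
    """
    seq = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    # A lookahead is used so that overlapping occurrences are not skipped.
    pattern = r"(?=(GU[ACU]))" if preferred_only else r"(?=(GU))"
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, seq)]


# Hypothetical placeholder sequence.
print(hammerhead_sites("AAGUCCGUGGUA"))
```

Results are returned in 5′ to 3′ order, so the earliest entries correspond to sites nearest the 5′ end of the sequence supplied.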

[0400] Preferably the ribozyme is engineered so that the cleavage recognition site is located near the 5′ end of the target gene mRNA, i.e., to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts.

[0401] The ribozymes of the present invention also include RNA endoribonucleases (hereinafter “Cech-type ribozymes”) such as the one that occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and that has been extensively described by Thomas Cech and collaborators (Zaug, et al., 1984, Science, 224:574-578; Zaug and Cech, 1986, Science, 231:470-475; Zaug, et al., 1986, Nature, 324:429-433; published International patent application No. WO 88/04300 by University Patents Inc.; Been and Cech, 1986, Cell, 47:207-216). The Cech-type ribozymes have an eight base pair active site which hybridizes to a target RNA sequence whereafter cleavage of the target RNA takes place. The invention encompasses those Cech-type ribozymes which target eight base-pair active site sequences that are present in the target gene.

[0402] As in the antisense approach, the ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells that express the target gene in vivo. A preferred method of delivery involves using a DNA construct “encoding” the ribozyme under the control of a strong constitutive pol III or pol II promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy endogenous target gene messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.

[0403] Endogenous target gene expression can also be reduced by inactivating or “knocking out” the target gene or its promoter using targeted homologous recombination (e.g., see Smithies, et al., 1985, Nature 317:230-234; Thomas and Capecchi, 1987, Cell 51:503-512; Thompson, et al., 1989, Cell 56:313-321; each of which is incorporated by reference herein in its entirety). For example, a mutant, non-functional target gene (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous target gene (either the coding regions or regulatory regions of the target gene) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express the target gene in vivo. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the target gene. Such approaches are particularly suited to the agricultural field, where modifications to ES (embryonic stem) cells can be used to generate animal offspring with an inactive target gene (e.g., see Thomas and Capecchi, 1987 and Thompson, 1989, supra). However, this approach can be adapted for use in humans provided the recombinant DNA constructs are directly administered or targeted to the required site in vivo using appropriate viral vectors.

[0404] Alternatively, endogenous target gene expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of the target gene (i.e., the target gene promoter and/or enhancers) to form triple helical structures that prevent transcription of the target gene in target cells in the body. (See generally, Helene, 1991, Anticancer Drug Des., 6(6):569-584; Helene, et al., 1992, Ann. N.Y. Acad. Sci., 660:27-36; and Maher, 1992, BioEssays 14(12):807-815).

[0405] Nucleic acid molecules to be used in triple helix formation for the inhibition of transcription should be single stranded and composed of deoxynucleotides. The base composition of these oligonucleotides must be designed to promote triple helix formation via Hoogsteen base pairing rules, which generally require sizeable stretches of either purines or pyrimidines to be present on one strand of a duplex. Nucleotide sequences may be pyrimidine-based, which will result in TAT and CGC+ triplets across the three associated strands of the resulting triple helix. The pyrimidine-rich molecules provide base complementarity to a purine-rich region of a single strand of the duplex in a parallel orientation to that strand. In addition, nucleic acid molecules may be chosen that are purine-rich, for example, contain a stretch of G residues. These molecules will form a triple helix with a DNA duplex that is rich in GC pairs, in which the majority of the purine residues are located on a single strand of the targeted duplex, resulting in GGC triplets across the three strands in the triplex.
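As a non-limiting sketch of how the "sizeable stretches of either purines or pyrimidines" requirement might be screened for computationally, the following Python function reports uninterrupted purine (A/G) or pyrimidine (C/T) tracts on one strand of a DNA sequence. The function name, the default minimum tract length of 10 and the example sequence are assumptions made for this example; the text above does not specify a minimum length.

```python
import re


def homopolymer_tracts(dna: str, min_len: int = 10):
    """Return (start, end, kind, sequence) for purine-only or pyrimidine-only runs.

    Positions are 0-based and end-exclusive; min_len is an assumed threshold.
    """
    seq = dna.upper()
    classes = (("purine", "[AG]"), ("pyrimidine", "[CT]"))
    tracts = []
    for kind, char_class in classes:
        pattern = f"{char_class}{{{min_len},}}"  # e.g. [AG]{10,}
        for m in re.finditer(pattern, seq):
            tracts.append((m.start(), m.end(), kind, m.group()))
    return sorted(tracts)


# Hypothetical placeholder sequence, scanned with a shorter threshold for brevity.
print(homopolymer_tracts("CCAGGAGAAGGAGATTCTC", min_len=8))
```

A tract found on one strand implies the complementary tract of the other class on the opposite strand, so scanning a single strand is sufficient for this purpose.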

[0406] Alternatively, the potential sequences that can be targeted for triple helix formation may be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[0407] In instances wherein the antisense, ribozyme, and/or triple helix molecules described herein are utilized to inhibit mutant gene expression, it is possible that the technique may so efficiently reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles that the concentration of normal target gene product present may be lower than is necessary for a normal phenotype. In such cases, to ensure that substantially normal levels of target gene activity are maintained, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity, and that do not contain sequences susceptible to whatever antisense, ribozyme, or triple helix treatments are being utilized, may be introduced into cells via gene therapy methods such as those described, below, in Section 5.9.2. Alternatively, in instances whereby the target gene encodes an extracellular protein, it may be preferable to co-administer normal target gene protein in order to maintain the requisite level of target gene activity.

[0408] Anti-sense RNA and DNA, ribozyme, and triple helix molecules of the invention may be prepared by any method known in the art for the synthesis of DNA and RNA molecules, as discussed above. These include techniques for chemically synthesizing oligodeoxyribonucleotides and oligoribonucleotides well known in the art such as, for example, solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding the antisense RNA molecule. Such DNA sequences may be incorporated into a wide variety of vectors that incorporate suitable RNA polymerase promoters such as the T7 or SP6 polymerase promoters. Alternatively, antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly, depending on the promoter used, can be introduced stably into cell lines.

5.8.2. GENE REPLACEMENT THERAPY

[0409] Nucleic acid sequences such as the HKNG1, GNKH and TS gene nucleic acid sequences described, above, in Section 5.1, can be utilized for transferring recombinant HKNG1, GNKH and/or TS nucleic acid sequences to cells and expressing said sequences in recipient cells. Such techniques can be used, for example, in marking cells or for the treatment of a disorder, such as a neuropsychiatric disorder (e.g., BAD or schizophrenia) mediated by HKNG1, GNKH or TS. Such treatment can be in the form of gene replacement therapy. Specifically, one or more copies of a normal HKNG1, GNKH and/or TS gene, or a portion of a HKNG1, GNKH or TS gene that directs the production of a gene product exhibiting normal function (i.e., normal HKNG1, GNKH or TS gene product function) can be inserted into the appropriate cells within a patient, e.g., using vectors that include, but are not limited to, adenovirus, adeno-associated virus and retrovirus vectors, in addition to other particular carriers, such as liposomes, that introduce DNA into cells.

[0410] Such gene replacement therapy techniques are preferably capable of delivering HKNG1, GNKH and/or TS gene sequences to the cell or tissue types within patients that normally express HKNG1, GNKH or TS, such as lung, trachea, kidney, pancreas, prostate, testis, ovary, stomach, intestine, thyroid, lymph node, spinal cord and, in particular, brain; including, e.g., the cerebellum, cerebral cortex, medulla, occipital pole, frontal lobe, temporal lobe, putamen, amygdala, caudate nucleus, corpus callosum, hippocampus and substantia nigra. In one embodiment, techniques that are well known to those of skill in the art (see, e.g., PCT Publication No. WO 89/10134, published Apr. 25, 1988) can readily be used to enable HKNG1, GNKH and/or TS gene sequences to cross the blood-brain barrier and, thus, to deliver the sequences to cells in the brain. With respect to delivery that is capable of crossing the blood-brain barrier, viral vectors such as, for example, those described above, are preferable.

[0411] In another embodiment, techniques for delivery involve direct administration, e.g., by stereotactic delivery of such HKNG1, GNKH and/or TS gene sequence to the site of the cells in which the HKNG1, GNKH and/or TS gene sequences are to be expressed.

[0412] Additional methods that may be utilized to increase the overall level of HKNG1, GNKH or TS gene expression and/or HKNG1, GNKH or TS gene product activity include using targeted homologous recombination methods, such as those discussed in Section 5.2, above, to modify the expression characteristics of an endogenous HKNG1, GNKH or TS gene in a cell or microorganism by inserting a heterologous DNA regulatory element such that the inserted regulatory element is operatively linked with the endogenous HKNG1, GNKH or TS gene in question. Targeted homologous recombination can thus be used to activate transcription of an endogenous gene, such as an endogenous HKNG1, GNKH or TS gene, that is “transcriptionally silent”, i.e., is not normally expressed or is normally expressed at very low levels, or to enhance the expression of an endogenous gene, such as an endogenous HKNG1, GNKH or TS gene, that is normally expressed.

[0413] The overall level of expression or activity in a patient of a gene or gene product of the present invention (i.e., a HKNG1 gene or gene product, a GNKH gene or gene product, or a TS gene or gene product) can also be increased by introducing appropriate HKNG1-, GNKH- or TS-expressing cells, preferably autologous cells, into the patient at positions and in numbers that are sufficient to ameliorate the symptoms of a disorder (e.g., a neuropsychiatric disorder such as BAD or schizophrenia) mediated by HKNG1, GNKH or TS. Such cells can be either recombinant or non-recombinant cells.

[0414] Among the cells that can be administered to increase the overall level of HKNG1, GNKH or TS gene expression in a patient are normal cells, preferably brain cells, that express the HKNG1, GNKH or TS gene. Alternatively, cells, preferably autologous cells, can be engineered to express HKNG1, GNKH and/or TS gene sequences, and may then be introduced into a patient in positions appropriate for the amelioration of the symptoms of a disorder, e.g., a neuropsychiatric disorder, mediated by HKNG1, GNKH or TS. Cells that express an unimpaired HKNG1, GNKH or TS gene and are from an MHC-matched individual can also be utilized. Such cells can include, for example, brain cells as well as other cell types that express HKNG1, GNKH or TS.

[0415] The expression of the HKNG1, GNKH and/or TS gene sequences is preferably controlled in the cells by gene regulatory sequences which allow such expression of HKNG1, GNKH and/or TS in the necessary cell types. Such gene regulatory sequences are well known to the skilled artisan. Such cell-based gene therapy techniques are well known to those skilled in the art, see, e.g., Anderson, U.S. Pat. No. 5,399,346.

[0416] When the cells to be administered are non-autologous cells, they can be administered using well known techniques that prevent a host immune response against the introduced cells from developing. For example, the cells may be introduced in an encapsulated form which, while allowing for an exchange of components with the immediate extracellular environment, does not allow the introduced cells to be recognized by the host immune system.

[0417] Additionally, compounds, such as those identified via techniques such as those described, above, in Section 5.8, that are capable of modulating HKNG1, GNKH and/or TS gene product activity can be administered using standard techniques that are well known to those of skill in the art. In instances in which the compounds to be administered are to involve an interaction with brain cells, the administration techniques should include well known ones that allow for a crossing of the blood-brain barrier.

5.8.3. PHARMACOGENOMICS

[0418] Agents or modulators which have a stimulatory or inhibitory effect on activity or expression of a polypeptide of the invention as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) disorders associated with, e.g., aberrant activity of the polypeptide. In conjunction with such treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) of the individual may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a consideration of the individual's genotype. Such pharmacogenomics can further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the activity of a polypeptide of the invention, expression of a nucleic acid of the invention or mutation content of a gene of the invention in an individual can be determined to thereby select an appropriate agent or appropriate agents for therapeutic or prophylactic treatment of the individual.

[0419] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Linder, 1997, Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated. Genetic conditions transmitted as a single factor altering the way drugs act on the body are referred to as “altered drug action.” Genetic conditions transmitted as single factors altering the way the body acts on drugs are referred to as “altered drug metabolism.” These pharmacogenetic conditions can occur either as rare defects or as polymorphisms. For example, and not by way of limitation, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[0420] As an exemplary, non-limiting embodiment, the activity of drug metabolizing enzymes is a major determinant of both the intensity and duration of drug action. The discovery of genetic polymorphisms of drug metabolizing enzymes, such as N-acetyltransferase 2 (NAT 2) and the cytochrome P450 enzymes CYP2D6 and CYP2C19, has provided an explanation as to why some patients do not obtain expected drug effects or show exaggerated drug response and serious toxicity after taking the standard and ordinarily safe dose of a drug. These polymorphisms are typically expressed in two phenotypes of the population, the extensive metabolizer (EM) and the poor metabolizer (PM). The prevalence of PM is different among different populations. For example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been identified in PM phenotypes, all of which lead to the absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side effects when they receive standard doses. If a metabolite is the active therapeutic moiety, a PM will show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by its CYP2D6-formed metabolite morphine. The other extreme are the so-called ultra-rapid metabolizers who do not respond to standard doses. Recently, the molecular basis of ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.
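Purely as a simplified illustration of the three metabolizer categories discussed above, the following Python sketch assigns a phenotype label from the number of functional copies of a drug metabolizing enzyme gene. The function name and the copy-number thresholds are assumptions made for this example: zero functional copies is taken to reflect the absence of functional enzyme (PM), and more than two copies is taken to reflect gene amplification (ultra-rapid). It is not a clinical genotyping scheme.

```python
def metabolizer_phenotype(functional_copies: int) -> str:
    """Map an assumed count of functional gene copies to a phenotype label.

    Thresholds are illustrative assumptions, not a validated clinical rule.
    """
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return "PM"           # poor metabolizer: no functional enzyme
    if functional_copies <= 2:
        return "EM"           # extensive metabolizer
    return "ultra-rapid"      # gene amplification


for copies in (0, 2, 3):
    print(copies, metabolizer_phenotype(copies))
```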

[0421] Thus, the activity of a polypeptide of the invention, expression of a nucleic acid encoding the polypeptide, or mutation content of a gene encoding the polypeptide in an individual can be determined to thereby select an appropriate agent or appropriate agents for treatment of the individual, including therapeutic or prophylactic treatment of the individual. In addition, pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the identification of an individual's drug responsiveness phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a modulator of activity or expression of the polypeptide, such as a modulator identified by one of the exemplary screening assays described herein.

5.8.4. MONITORING EFFECTS DURING CLINICAL TRIALS

[0422] Monitoring the influence of agents (e.g., drugs and other compounds) on the expression or activity of a polypeptide of the invention (e.g., the ability to modulate aberrant cell proliferation, chemotaxis and/or differentiation) can be applied, not only in basic drug screening, but also in clinical trials. For example, the effectiveness of an agent, as determined by a screening assay described herein, to increase gene expression, protein levels or protein activity, can be monitored in clinical trials of subjects exhibiting decreased gene expression, protein levels, or protein activity. Alternatively, the effectiveness of an agent, as determined by a screening assay, to decrease gene expression, protein levels or protein activity, can be monitored in clinical trials of subjects exhibiting increased gene expression, protein levels or protein activity. In such clinical trials, expression or activity of a gene or polypeptide of the invention and, preferably, that of other genes or polypeptides that have been implicated, for example, in a neuropsychiatric disorder, can be used as a marker of the effectiveness of the agent or therapy.

[0423] For example, and not by way of limitation, genes, including those of the invention, that are modulated in cells by treatment with an agent (e.g., a compound such as a drug or other small molecule) which modulates activity or expression of a gene or polynucleotide of the invention (e.g., such as a compound identified in one of the above-described screening assays) can be readily identified by those skilled in the art. Thus, to study the effect of agents on neuropsychiatric disorders, for example, in a clinical trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of a gene of the invention and for levels of expression of other genes implicated in neuropsychiatric disorders. The levels of gene expression (i.e., a gene expression pattern) can be quantified, for example, by Northern blot analysis or using RT-PCR, as described herein, or, alternatively, by measuring the amount of protein produced, e.g., using any of the methods described herein, or by measuring the levels of activity of a gene or gene product of the invention or of other genes or gene products, particularly other genes or gene products associated with similar disorders (e.g., other genes or gene products associated with neuropsychiatric disorders such as BAD). In this way, the gene expression pattern can serve as a marker, indicative of the physiological response of the cells to the agent. Accordingly, the response state may be determined before, at various points during, and after the treatment of the individual.

[0424] In a preferred embodiment, the present invention provides a method for monitoring the effectiveness of treatment of a subject with one or more agents (e.g., agonists, antagonists, peptidomimetic, protein, peptide, nucleic acid, small molecule or other drug candidate identified by the screening assays described herein) comprising the steps of: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of the polypeptide or nucleic acid of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of the polypeptide or nucleic acid of the invention in the post-administration samples; (v) comparing the level of the polypeptide or nucleic acid of the invention in the pre-administration sample with the level of the polypeptide or nucleic acid of the invention in the post-administration sample or samples; and (vi) altering the administration of the agent to the subject accordingly. For example, increased administration of the agent may be desirable to increase the expression or activity of the polypeptide to higher levels than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased administration of the agent may be desirable to decrease expression or activity of the polypeptide to lower levels than detected, i.e., to decrease the effectiveness of the agent.
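
By way of illustration only, the comparison of a pre-administration level with one or more post-administration levels can be sketched in Python as follows. The function names, the 20% tolerance band and the numeric levels are hypothetical assumptions chosen for the sketch; they are not values or procedures taken from the method described above.

```python
def fold_changes(pre_level, post_levels):
    """Return each post-administration level relative to the pre-administration level."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration level is required")
    return [post / pre_level for post in post_levels]


def classify_change(fold, tolerance=0.2):
    """Label a fold change; the tolerance band is an illustrative assumption."""
    if fold > 1 + tolerance:
        return "increased"
    if fold < 1 - tolerance:
        return "decreased"
    return "unchanged"


# Hypothetical expression levels in arbitrary units.
folds = fold_changes(10.0, [15.0, 5.0, 10.5])
labels = [classify_change(f) for f in folds]
print(labels)
```

The labels could then inform whether administration of the agent is increased, decreased or left unchanged; that decision remains with the practitioner.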

5.9. PHARMACEUTICAL PREPARATIONS AND METHODS OF ADMINISTRATION

[0425] The compounds, such as those described in the preceding sections above, that are determined to affect HKNG1, GNKH or TS gene expression or gene product activity can be administered to a patient at therapeutically effective doses to treat or ameliorate a disorder, such as a neuropsychiatric or other disorder described herein, mediated by a HKNG1 gene or gene product, to treat or ameliorate a disorder, such as a neuropsychiatric disorder or other disorder described herein, mediated by a GNKH gene or gene product, or to treat or ameliorate a disorder, such as a neuropsychiatric disorder or other disorder described herein, mediated by a TS gene or gene product. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of such a disorder. Such doses are described, in detail, in Section 5.9.1, below. Formulations of such pharmaceutical compositions, as well as methods of their use and administration, are described in Section 5.9.2.

5.9.1. EFFECTIVE DOSE

[0426] As defined herein, a therapeutically effective amount of antibody, protein, or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.
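
As a worked illustration of the weight-based ranges above, the following Python sketch converts a mg/kg range into an absolute per-administration amount. The 70 kg body weight and the function name are assumptions made for the example only; the 0.1 to 20 mg/kg range is the one recited in the preferred example above.

```python
def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a weight-based dose range (mg/kg) into absolute amounts (mg)."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 <= low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("dose range must satisfy 0 <= low <= high")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Hypothetical 70 kg subject, 0.1 to 20 mg/kg administered once per week.
low_mg, high_mg = dose_range_mg(70.0, 0.1, 20.0)
print(f"{low_mg:.1f} mg to {high_mg:.1f} mg per weekly administration")
```

Under these assumptions the range spans roughly 7 mg to 1,400 mg per administration, which illustrates how wide the recited ranges are in absolute terms.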

[0427] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic and inorganic compounds (including, e.g., heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters and other pharmaceutically acceptable forms of such compounds.

[0428] It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian or researcher. For example, the dose of a small molecule used in the methods of the invention can vary depending upon the identity, size and condition of the subject or sample being treated as well as upon the route by which the composition is to be administered, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (for example, about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is further understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be readily determined, e.g., using the assays described herein.

[0429] As an example, and not by way of limitation, when one or more small molecules are to be administered to a subject (e.g., a human or other animal) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian or researcher may, for example, prescribe a relatively low dose at first and, subsequently, increase the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including, for example, the activity of the specific compound employed, the age, body weight, general health, gender and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combinations also being administered to the subject, and the degree of gene or gene product expression or activity to be modulated.

[0430] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to unaffected cells and, thereby, reduce side effects.
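
The therapeutic index defined above is a simple ratio, and a minimal Python sketch of its use in ranking candidate compounds is given here. The compound names and the LD50 and ED50 values are hypothetical, and both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg; not measured data.
candidates = {
    "compound_A": (500.0, 25.0),
    "compound_B": (300.0, 60.0),
}

# Larger indices are preferred, so sort in descending order.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
print(ranked)
```

With these illustrative numbers, compound_A (index 20) would rank ahead of compound_B (index 5).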

[0431] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

5.9.2. FORMULATIONS AND USE

[0432] Pharmaceutical compositions for use in accordance with the present invention may be formulated in conventional manner using one or more physiologically acceptable carriers or excipients.

[0433] Thus, the compounds and their physiologically acceptable salts and solvates may be formulated for administration by inhalation or insufflation (either through the mouth or the nose) or oral, buccal, parenteral, rectal or topical administration.

[0434] For oral administration, the pharmaceutical compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinised maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulfate). The tablets may be coated by methods well known in the art. Liquid preparations for oral administration may take the form of, for example, solutions, syrups or suspensions, or they may be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer salts, flavoring, coloring and sweetening agents as appropriate.

[0435] Preparations for oral administration may be suitably formulated to give controlled release of the active compound.

[0436] For buccal administration the compositions may take the form of tablets or lozenges formulated in conventional manner.

[0437] For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.

[0438] The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.

[0439] The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.

[0440] In certain embodiments, it may be desirable to administer the pharmaceutical compositions of the invention locally to the area in need of treatment. This may be achieved by, for example, and not by way of limitation, local infusion during surgery, topical application, e.g., in conjunction with a wound dressing after surgery, by injection, by means of a catheter, by means of a suppository, or by means of an implant, said implant being of a porous, non-porous, or gelatinous material, including membranes, such as silastic membranes, or fibers. In one embodiment, administration can be by direct injection at the site (or former site) of a malignant tumor or neoplastic or pre-neoplastic tissue.

[0441] For topical application, the compounds may be combined with a carrier so that an effective dosage is delivered, based on the desired activity.

[0442] A topical formulation for treatment of some of the eye disorders discussed infra (e.g., myopia) consists of an effective amount of the compounds in an ophthalmologically acceptable excipient such as buffered saline, mineral oil, vegetable oils such as corn or arachis oil, petroleum jelly, Miglyol 182, alcohol solutions, or liposomes or liposome-like products. Any of these compositions may also include preservatives, antioxidants, antibiotics, immunosuppressants, and other biologically or pharmaceutically effective agents which do not exert a detrimental effect on the compound.

[0443] In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

[0444] The compositions may, if desired, be presented in a pack or dispenser device that may contain one or more unit dosage forms containing the active ingredient. The pack may for example comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration.

6. EXAMPLE The HKNG1 Gene of Chromosome 18 is Associated With the Neuropsychiatric Disorder BAD

[0445] In the Example presented in this Section, studies are described that define a narrow interval of approximately 27 kb on the short arm of human chromosome 18 which is associated with the neuropsychiatric disorder BAD. The interval is demonstrated to lie within the gene referred to herein as the HKNG1 gene.

6.1. MATERIALS AND METHODS

[0446] Linkage Disequilibrium:

[0447] Linkage disequilibrium (LD) studies were performed using DNA from a population sample of neuropsychiatric disorder (BP-I) patients. The population sample and LD techniques were as described in Escamilla et al., 1996, Am. J. Med. Genet. 67:244-253. The present LD study took advantage of the additional population sample collection and the additional physical markers identified via the physical mapping techniques described below.

[0448] Yeast Artificial Chromosome (YAC) Mapping:

[0449] For physical mapping, yeast artificial chromosomes (YACs) containing human sequences were mapped to the region being analyzed based on publicly available maps (Cohen et al., 1993, C.R. Acad. Sci. 316:1484-1488). The YACs were then ordered and contig reconstructed by performing standard sequence tagged site (STS)-content mapping with microsatellite markers and non-polymorphic STSs available from databases that surround the genetically defined candidate region.

[0450] Bacterial Artificial Chromosome (BAC) Mapping:

[0451] STSs from the short arm of human chromosome 18 were used to screen a human BAC library (Research Genetics, Huntsville, Ala.). The ends of the BACs were cloned or directly sequenced. The end sequences were used to amplify the next overlapping BACs. From each BAC, additional microsatellites were identified. Specifically, random sheared libraries were prepared from overlapping BACs within the defined genetic interval. BAC DNA was sheared with a nebulizer (CIS-US Inc., Bedford, Mass.). Fragments in the size range of 600 to 1,000 bp were utilized for the sublibrary production. Microsatellite sequences from the sublibraries were identified by corresponding microsatellite probes. Sequences around such repeats were obtained to enable development of PCR primers for genomic DNA.

[0452] Radiation Hybrid (RH) Mapping:

[0453] Standard RH mapping techniques were applied to a Stanford G3 RH mapping panel (Research Genetics, Huntsville, Ala.) to order all microsatellite markers and non-polymorphic STSs in the region being analyzed.

[0454] Sample Sequencing:

[0455] Random sheared libraries were made from all the BACs within the defined genetic region. Approximately 9,000 subclones within the approximately 340 kb region containing the BAD interval were sequenced with vector primers in order to achieve an 8-fold sequence coverage of the region. All sequences were processed through an automated sequence analysis pipeline that assessed quality, removed vector sequences and masked repetitive sequences. The resulting sequences were then compared to public DNA and protein databases using BLAST algorithms (Altschul, et al., 1990, J. Mol. Biol. 215:403-410).

[0456] All sequences were contiged using Sequencher 3.0 (Gene Codes Corp.) and PHRED and PHRAP (Phil Green, Washington University) into a single DNA fragment of 340 kb.
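
The relationship among the subclone count, region size and fold coverage stated above can be checked with the usual approximation coverage = (number of reads x mean read length) / region length. The Python sketch below is illustrative only: it assumes one usable read per subclone, which the text does not state, and the function names are hypothetical.

```python
def fold_coverage(n_reads, mean_read_length_bp, region_bp):
    """Average fold coverage of a region from read count and mean read length."""
    return n_reads * mean_read_length_bp / region_bp


def implied_read_length_bp(n_reads, coverage, region_bp):
    """Mean read length implied by a target coverage, read count and region size."""
    return coverage * region_bp / n_reads


# Figures from the text: about 9,000 subclones, about 340 kb, 8-fold coverage.
# Assumption: one usable read per subclone.
read_len = implied_read_length_bp(9000, 8, 340_000)
print(f"implied mean read length: {read_len:.0f} bp")
```

Under the one-read-per-subclone assumption the figures imply a mean usable read length of roughly 300 bp; if each subclone were instead read from both ends, the implied length per read would be about half of that.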

6.2. RESULTS

[0457] Human genetic regions involved in bipolar affective disorder (BAD) had previously been reported to map to portions of the long (18q) and short (18p) arms of human chromosome 18 (Freimer et al., 1996, Neuropsychiat. Genet. 67:254-263; Freimer et al., 1996, Nature Genetics 12:436-441; and McInnis et al., 1996, Proc. Natl. Acad. Sci. U.S.A. 93:13060-13065).

[0458] High Resolution Physical Mapping Using YAC, BAC and RH Techniques:

[0459] In order to provide the precise order of genetic markers necessary for linkage and LD mapping, and to guide new microsatellite marker development for finer mapping, a high resolution physical map of the 18p candidate region was developed using YAC, BAC and RH techniques.

[0460] For such physical mapping, first, YACs were mapped to the chromosome 18 region being analyzed. Using the mapped YAC contig as a framework, the region from publicly available markers spanning the 18p region was also mapped and contiged with BACs. Sublibraries from the contiged BACs were constructed, from which microsatellite marker sequences were identified and sequenced.

[0461] To ensure development of an accurate physical map, the radiation hybrid (RH) mapping technique was independently applied to the region being analyzed. RH was used to order all microsatellite markers and non-polymorphic STSs in the region. Thus, the high resolution physical map ultimately constructed was obtained using data from RH mapping and STS-content mapping.

[0462] Linkage Disequilibrium:

[0463] Prior to attempting to identify gene sequences, studies were performed to further narrow the neuropsychiatric disorder region. Specifically, a linkage disequilibrium (LD) analysis was performed using population samples and techniques as described in Section 6.1, above, which took advantage of the additional physical markers identified via the physical mapping techniques described above.

[0464] Initial LD analysis narrowed the interval which associates with BAD to a 340 kb region of 18p. BAC clones within this newly identified neuropsychiatric disorder region were analyzed to identify specific genes within the region. A combination of sample sequencing, cDNA selection and transcription mapping analyses was used to arrange sequences into tentative transcription units, that is, tentatively delineating the coding sequences of genes within this genomic region of interest.

[0465] Subsequent LD analyses further narrowed the BAD region of 18p to a narrow interval of approximately 27 kb. This was accomplished by identifying the maximum haplotype shared among affected individuals using additional markers. Statistical analysis of the entire 18p candidate region indicated that the 27 kb haplotype was significantly elevated in frequency among affected Costa Rican individuals (LOD=2.2; p=0.0005).

[0466] This newly identified narrow interval was found to map completely within one of the transcription units identified as described above. The gene corresponding to this transcription unit is referred to herein as the HKNG1 gene. Thus, the results of the mapping analyses presented in this Section demonstrate that the HKNG1 gene of human chromosome 18 is associated with the neuropsychiatric disorder BAD.

[0467] Analysis of the BAD interval indicated that the 27 kb BAD disease-associated chromosomal interval identified in the linkage disequilibrium studies is contained within an approximately 60 kb genomic region which contains a sequence referred to as GS4642 or rod photoreceptor protein (RPP) gene (Shimizu-Matsumoto, A. et al., 1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585).

7. EXAMPLE Sequence and Characterization of the HKNG1 Gene

[0468] As demonstrated in the Example presented in Section 6, above, the HKNG1 gene is involved in the neuropsychiatric disorder BAD. The results presented in this Section further characterize the HKNG1 gene and gene product. In particular, isolation of additional cDNA clones and analyses of genomic and cDNA sequences have revealed both the full length HKNG1 amino acid sequence and the HKNG1 genomic intron/exon structure. In particular, the nucleotide and predicted amino acid sequence of the HKNG1 gene identified by these analyses disclose new HKNG1 exon sequences, including new HKNG1 protein coding sequence, discovered herein. Further, the expression of HKNG1 in human tissue, especially neural tissue, is characterized by Northern and in situ hybridization analysis. The results presented herein are consistent with the HKNG1 gene being a gene which mediates neuropsychiatric disorders such as BAD.

7.1. MATERIALS AND METHODS

[0469] HKNG1 cDNA Clone Isolation:

[0470] Hybridization of a human brain and kidney cDNA library was performed according to standard techniques and identified a full-length HKNG1 cDNA clone. In addition, a HKNG1 cDNA derived from a splice variant was isolated, as described in Section 7.2, below.

[0471] Northern Blot Analysis:

[0472] Standard RNA isolation techniques and Northern blotting procedures were followed. The HKNG1 probe utilized corresponds to the complementary sequence of base pairs 1367 to 1578 of the full length HKNG1 cDNA sequence (SEQ ID NO. 1). Clontech multiple tissue northern blots were probed. In particular, Clontech human I, human II, human III, human fetal II, human brain II and human brain III blots were utilized for this study.

[0473] In Situ Hybridization Analysis:

[0474] Standard in situ hybridization techniques were utilized. The HKNG1 probe utilized corresponds to the complementary sequence of base pairs 910 to 1422 of the full length HKNG1 cDNA sequence (SEQ ID NO. 1). Brains for in situ hybridization analysis were obtained from McLean Hospital (The Harvard Brain Tissue Resource Center, Belmont, Mass. 02178).

[0475] Other Techniques:

[0476] The remaining techniques described in Section 7.2, below, were performed according to standard techniques or as discussed in Section 6.1, above.

7.2. RESULTS

[0477] HKNG1 Nucleotide and Amino Acid Sequence:

[0478] A human brain cDNA library was screened and a full-length clone of HKNG1 was isolated from this library, as described above. By comparing the isolated cDNA sequence to sequences in the public databases, a clone was identified which had been previously identified as GS4642, or rod photoreceptor protein (RPP) gene (GenBank Accession No. D63813; Shimizu-Matsumoto, A. et al., 1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585). Although Shimizu-Matsumoto et al. refer to GS4642 as a full-length cDNA sequence, the isolated HKNG1 cDNA extends approximately 200 bp beyond the 5′ end of the identified GS4642 clone.

[0479] Importantly, the HKNG1 clone isolated herein reveals that, contrary to the amino acid sequence described in Shimizu-Matsumoto et al., the full length HKNG1 amino acid sequence contains an additional 29 amino acid residues N-terminal to what had previously been identified as the full-length RPP (SEQ ID NO:64). The full-length HKNG1 nucleotide sequence (SEQ ID NO: 1) and the derived amino acid sequence of the full-length HKNG1 polypeptide (SEQ ID NO: 2) encoded by this sequence are depicted in FIGS. 1A-1C.

[0480] The full-length HKNG1 polypeptide was found to contain two clusterin similarity domains: clusterin similarity domain 1 (SEQ ID NO:125) which corresponds to amino acid residue 134 to amino acid residue 160 of the full-length HKNG1 polypeptide sequence (SEQ ID NO:2), and clusterin similarity domain 2 (SEQ ID NO:125) which corresponds to amino acid residue 334 to amino acid residue 362 of the full length HKNG1 polypeptide sequence (SEQ ID NO:2). Such clusterin domains are typically characterized by five shared cysteine residues. In clusterin domain 1, these shared cysteine residues correspond to Cys134, Cys145, Cys148, Cys153, and Cys160. The shared cysteine residues in clusterin domain 2 correspond to the residues Cys334, Cys344, Cys351, Cys354, and Cys362.

[0481] Full-length HKNG1 cDNA sequence was compared with the genomic contig completed by random sheared library sequencing. Exon-intron boundaries were identified manually by aligning the two sequences in Sequencher 3.0 and by observing the conservative splicing sites where the alignments ended. This sequence comparison revealed that the additional cDNA sequence discovered through isolation of the full-length HKNG1 cDNA clone actually belongs within three HKNG1 exons.

[0482] Prior to the isolation and analysis of HKNG1 cDNA described herein, nine exons were predicted to be present within the corresponding genomic sequence. As discovered herein, however, the HKNG1 gene, in contrast, actually contains 13 exons, with the new cDNA containing sequence which corresponds to a new exon 1, exon 2 and a 5′ extension of what had previously been designated exon 1. Splice variants, discussed in Section 9 below, also exist which comprise additional exons 2′ and 2″. The genomic sequence and intron/exon structure of the HKNG1 gene is shown in FIG. 3A-3A-28.

[0483] The breakdown of exons was confirmed by the perfect alignment of the cDNA sequence with the genomic sequence and by observation of expected splicing sites flanking each of the additional, newly discovered exons.

[0484] HKNG1 nucleotide sequence was used to search databases of partial sequences of cDNA clones. This search identified a partial cDNA sequence derived from IMAGE clone 37892 (GenBank Accession No. R61493) having similarity to the human HKNG1 sequence. IMAGE clone R61493 was obtained and consists of a cDNA insert, the Lafmid BA vector backbone, and DNA originating from the oligo dT primer and Hind III adaptors used in cDNA library construction. The Lafmid BA vector nucleotide sequence is available at the URL http://image.rzpd.de/lafmida_seq.html and descriptions of the oligo dT primer and Hind III adaptors are available in the GENBANK record corresponding to accession number R61493.

[0485] The sequence of the cDNA insert revealed that the insert was derived from an alternatively spliced HKNG1 mRNA variant, referred to herein as HKNG1-V1. In particular, this HKNG1 variant is deleted for exon 3 of the full length 13 exon HKNG1 sequence. The nucleotide sequence of this HKNG1 variant (SEQ ID NO:3) is depicted in FIG. 2A-C. The amino acid sequence encoded by the HKNG1 variant (SEQ ID NO:3) is also shown in FIG. 2A-C.

[0486] Preferably therefore, the nucleic acids of the invention include nucleic acid molecules comprising the nucleotide sequence of HKNG1-V1 or encoding the polypeptide encoded by HKNG1-V1 in the absence of heterologous sequences (e.g., cloning vector sequences such as Lafmid BA; oligo dT primer, and Hind III adaptor).

[0487] HKNG1 Gene Expression:

[0488] HKNG1 gene expression was examined by Northern blot analysis in various human tissues. A transcript of approximately 2 kb was detected in fetal brain, lung and kidney, and in adult brain, kidney, pancreas, prostate, testis, ovary, stomach, thyroid, spinal cord, lymph node and trachea. An approximately 1.5 kb transcript was also seen in trachea. In addition, a larger transcript of approximately 5 kb was detected in all adult neural regions tested (that is, cerebellum, cortex, medulla, spinal cord, occipital pole, frontal lobe, temporal, putamen, amygdala, caudate nucleus, corpus callosum, hippocampus, whole brain, substantia nigra, subthalamic nucleus and thalamus). Once again, this is in direct contrast to previous Northern analysis of the RPP gene, which reported that expression was limited to the retina (Shimizu-Matsumoto, A. et al., 1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585).

[0489] Analysis of the HKNG1 tissue distribution was extended through an in situ hybridization analysis. In particular, the HKNG1 mRNA distribution in normal human brain tissue was analyzed. The results of this analysis are depicted in FIGS. 4A and 4B. As summarized in FIGS. 4A and 4B, HKNG1 is expressed throughout the brain, with transcripts being localized to neuronal and grey matter cell types.

[0490] Finally, expression of HKNG1 in recombinant cells demonstrates that the HKNG1 gene encodes a secreted polypeptide(s).

8. EXAMPLE A Missense Mutation Within HKNG1 Correlates With BAD

[0491] The Example presented in Section 6, above, shows that the BAD disorder maps to an interval completely contained within the HKNG1 gene of the short arm of human chromosome 18. The Example presented in Section 7, above, characterizes the HKNG1 gene and gene products. The results presented in this Example further these studies by identifying a mutation within the coding region of a HKNG1 allele of an individual exhibiting a BAD disorder.

[0492] Thus, the results described herein demonstrate a positive correlation between a mutation which encodes a non-wild-type HKNG1 polypeptide and the appearance of the neuropsychiatric disorder BAD. The results presented herein, coupled with the results presented in Section 6, above, identify HKNG1 as a gene which mediates neuropsychiatric disorders such as BAD.

8.1. MATERIALS AND METHODS

[0493] Pairs of PCR primers that flank each exon (see TABLE 1, above) were made and used to PCR amplify genomic DNA isolated from BAD affected and normal individuals. The amplified PCR products were analyzed using SSCP gel electrophoresis or by DNA sequencing. The DNA sequences and SSCP patterns of the affected individuals and controls were compared and variations were further analyzed.

8.2. RESULTS

[0494] In order to more definitively show that the HKNG1 gene mediates neuropsychiatric disorders, in particular BAD, a study was conducted to explore whether a HKNG1 mutation that correlates with BAD could be identified.

[0495] First, exon scanning was performed on the eleven exons originally identified in the HKNG1 gene using chromosomes isolated from three affected and one normal individual from the Costa Rican population utilized for the LD studies discussed in Section 6, above. No obvious mutations correlating with BAD were found through this analysis.

[0496] Next, HKNG1 intron and 3′-untranslated regions within the 27 kb BAD interval were scanned by SSCP and/or sequencing for all variants among three affected and one normal individual from the same population. Approximately 60 variants were identified after scanning approximately two-thirds of the 27 kb genomic interval. These variants can be genotyped and analyzed by haplotype sharing and LD analyses, as described above, in order to identify ones which correlate with bipolar affective disorder. FIGS. 5A-C list selected variants identified through this study.

[0497] Exon scanning using chromosomal DNA from the general population of Costa Rica, however, successfully identified a HKNG1 missense mutation in an individual affected with BAD who did not share the common diseased haplotype identified by the LD analysis provided above. In particular, exon scanning was done on exons 1-11 of HKNG1 nucleic acid from 129 individuals from the general population affected with BAD.

[0498] This analysis identified a point mutation in the coding region of exon 7 not seen in individuals not affected with bipolar affective disorder. Specifically, the guanine corresponding to nucleotide residue 604 of SEQ ID NO:1 (or nucleotide residue 550 of SEQ ID NO:3) had mutated to an adenine. HKNG1 protein expressed from this mutated HKNG1 allele comprises the substitution of a lysine residue at amino acid residue 202 of SEQ ID NO:2 (or amino acid residue 184 of SEQ ID NO:4) in place of the wild-type glutamic acid residue.
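
The relationship between the nucleotide and amino acid positions given above can be illustrated with a short sketch. It assumes, purely for illustration, that nucleotide numbering starts at the first base of the initiator codon (the `cds_start` parameter is hypothetical and is not stated in the text); under that assumption both stated nucleotide positions fall on the first base of the stated codons, and a G-to-A change at the first base of either glutamic acid codon gives a lysine codon.

```python
# Sketch only: cds_start=1 is an assumption, not a value taken from the sequence listing.

# Minimal lookup covering just the codons needed for this illustration.
CODON_TO_AA = {"GAA": "E", "GAG": "E", "AAA": "K", "AAG": "K"}


def codon_coordinates(nt_pos, cds_start=1):
    """Map a 1-based nucleotide position to (codon number, position within codon)."""
    offset = nt_pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the coding sequence")
    return offset // 3 + 1, offset % 3 + 1


def mutate_codon(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[: pos_in_codon - 1] + new_base + codon[pos_in_codon:]


if __name__ == "__main__":
    for label, nt in (("SEQ ID NO:1", 604), ("SEQ ID NO:3", 550)):
        codon_no, pos = codon_coordinates(nt)
        print(f"{label}: nucleotide {nt} -> codon {codon_no}, base {pos}")

    # Both glutamic acid codons become lysine codons after G->A at base 1.
    for wild_type in ("GAA", "GAG"):
        mutant = mutate_codon(wild_type, 1, "A")
        print(f"{wild_type} ({CODON_TO_AA[wild_type]}) -> {mutant} ({CODON_TO_AA[mutant]})")
```

The same helper can be reused with a different `cds_start` if the coding sequence begins elsewhere in a given sequence record.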

[0499] Additional HKNG1 polymorphisms relative to the HKNG1 wild-type sequence, and which, therefore, represent HKNG1 alleles, were identified through sequence analysis of the HKNG1 alleles within a collection of schizophrenic patients of mixed ethnicity from the United States and within a BAD collection from the San Francisco area. These variants are depicted in FIGS. 5A and 5B, respectively. Statistical analysis indicated that there were significantly more variants in the collection of schizophrenic patients of mixed ethnicity from the United States and the San Francisco BAD and Costa Rican BAD samples than in a collection of 242 controls (p<0.05).

9. EXAMPLE Identification of Additional HKNG1 Splice Variants

[0500] This example describes the isolation and identification of novel splice variants of the human HKNG1 gene. Three internal splice variants were identified by screening a human retinal cDNA library or by RT-PCR analysis. In addition, many 3′ alternative splice variants were isolated and identified by Rapid Amplification of cDNA Ends (RACE).

9.1. MATERIALS AND METHODS

[0501] A human retinal cDNA library was screened with probes to isolate a novel HKNG1 clone. RT-PCR was also performed to isolate additional HKNG1 sequences using the following primer sequences:

5′-AGTTGCGTCCCTGTCTGTTG-3′ (SEQ ID NO:67)
5′-GCTTCATGTTCCCGCTGTTA-3′ (SEQ ID NO:68)
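
As a rough check on a primer pair such as the one above, GC content and an approximate melting temperature can be computed directly from the sequences. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a coarse estimate for short oligonucleotides and is not a method described in this Example; the function names are illustrative.

```python
# Sketch only: the Wallace rule is a coarse approximation, chosen here for simplicity.

PRIMERS = {
    "SEQ ID NO:67": "AGTTGCGTCCCTGTCTGTTG",
    "SEQ ID NO:68": "GCTTCATGTTCCCGCTGTTA",
}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```

Nearest-neighbor models give better estimates when salt and primer concentrations are known; the Wallace rule is used here only because it needs no additional parameters.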

[0502] To investigate the possibility of alternate splice variants at the 3′ end of the HKNG1 gene, 3′ Rapid Amplification of cDNA Ends (“RACE”) was performed using Clontech Marathon Ready cDNA derived from brain, kidney and retina. Briefly, PCR was performed using a Clontech Advantage-GC cDNA PCR Kit with 2-5 μl of the cDNA samples described above, 1× reaction buffer, 200 μM each dNTP, 1 M GC Melt, 1× Advantage-GC Polymerase Mix, and 20 pmole each primer in a final volume of 50 μl. Lastly, PCR products were gel-purified and ligated into pGem T Easy (Promega), and positive clones were sequenced using standard dye-terminator chemistry.
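
The final concentrations listed above translate into pipetting volumes through the usual dilution relation C1·V1 = C2·V2. The following sketch works this out for a 50 μl reaction. All stock concentrations (5× buffer, 10 mM each dNTP, 5 M GC Melt, 50× polymerase mix, 10 μM primers) and the 5 μl cDNA volume are assumptions chosen for illustration and should be replaced with the values for the reagents actually in hand.

```python
# Sketch only: every stock concentration below is a hypothetical value, not taken from the text.

FINAL_VOLUME_UL = 50.0


def volume_needed(final_conc, stock_conc, final_volume=FINAL_VOLUME_UL):
    """Volume of stock (same units as final_volume) giving final_conc after dilution."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_volume * final_conc / stock_conc


def reaction_mix(cdna_ul=5.0, primer_pmol=20.0, primer_stock_um=10.0):
    """Return a dict of component volumes in microliters for one reaction."""
    # A 1 uM solution contains 1 pmol per microliter.
    primer_ul = primer_pmol / primer_stock_um
    mix = {
        "reaction buffer (5x stock, assumed)": volume_needed(1.0, 5.0),
        "dNTP mix (10 mM each stock, assumed)": volume_needed(0.2, 10.0),
        "GC Melt (5 M stock, assumed)": volume_needed(1.0, 5.0),
        "polymerase mix (50x stock, assumed)": volume_needed(1.0, 50.0),
        "forward primer": primer_ul,
        "reverse primer": primer_ul,
        "cDNA": cdna_ul,
    }
    mix["water"] = FINAL_VOLUME_UL - sum(mix.values())
    return mix


if __name__ == "__main__":
    for component, volume in reaction_mix().items():
        print(f"{component}: {volume:.1f} ul")
```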

[0503] To identify splice variants in exon 10 of HKNG1, the following two primers, one forward primer in exon 9 (9F) and one reverse primer in exon 11 (11R) of HKNG1, were used in RACE.

9F 5′-ACT GTC CTG ATG TAC CTG CTC TGC-3′
11R 5′-CAA AGA ACT ACT AAT GTA CCA TG-3′

[0504] PCR was performed with 2 μl of the cDNA described above with cycling parameters of 94° C. for 3 minutes×1; (94° C. for 30 seconds, 60° C. for 30 seconds, 72° C. for 45 seconds)×35; 72° C. for 7 minutes×1; hold at 4° C.

[0505] To identify other 3′ splice variants, the following two primers, one forward primer in exon 9 (9F) and one reverse primer in the poly A region (AP2), were used in RACE.

9F 5′-ACT GTC CTG ATG TAC CTG CTC TGC-3′
AP2 5′-ACT CAC TAT AGG GCT CGA GCG GC-3′

[0506] 5 μl of the cDNA described above was used in PCR with the following cycling parameters: 95° C. for 3 minutes×1; (95° C. for 30 seconds, 72° C. for 30 seconds, and 72° C. for 1 minute)×2; lower annealing temperature by 2° C. every 2 cycles until 62° C.; then (95° C. for 30 seconds, 55° C. for 30 seconds, 72° C. for 1 minute)×25; 72° C. for 7 minutes×1; then hold at 4° C.
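
The touchdown cycling parameters above can be written out step by step. The sketch below expands them into an explicit program, assuming that the 62° C. annealing step is itself run for 2 cycles before the fixed 55° C. phase begins (the text does not say whether the endpoint is inclusive); the dictionary layout and function name are illustrative only.

```python
# Sketch only: treating the 62 C step as inclusive is an assumption.

def touchdown_program(start_anneal=72, last_touchdown_anneal=62, step=2,
                      cycles_per_step=2, fixed_anneal=55, fixed_cycles=25):
    """Expand touchdown parameters into a list of program blocks.

    Each block is a dict with a repeat count and a list of
    (temperature in C, duration in seconds) steps.
    """
    program = [{"repeats": 1, "steps": [(95, 180)]}]  # initial denaturation
    anneal = start_anneal
    while anneal >= last_touchdown_anneal:
        program.append({"repeats": cycles_per_step,
                        "steps": [(95, 30), (anneal, 30), (72, 60)]})
        anneal -= step
    program.append({"repeats": fixed_cycles,
                    "steps": [(95, 30), (fixed_anneal, 30), (72, 60)]})
    program.append({"repeats": 1, "steps": [(72, 420)]})  # final extension
    return program


def total_cycles(program):
    """Count amplification cycles (blocks with three steps per cycle)."""
    return sum(block["repeats"] for block in program if len(block["steps"]) == 3)


if __name__ == "__main__":
    for block in touchdown_program():
        steps = ", ".join(f"{temp} C for {secs} s" for temp, secs in block["steps"])
        print(f"{block['repeats']} x ({steps})")
    print("total amplification cycles:", total_cycles(touchdown_program()))
```

The final hold at 4° C. is omitted because it has no fixed duration.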

9.2. RESULTS

[0507] A novel HKNG1 clone was isolated from a human retinal cDNA library. This clone, which completely lacks exon 7 of the full length HKNG1 cDNA sequence, is referred to herein as HKNG1Δ7. Because the deletion of exon 7 from the full length HKNG1 sequence leads to an immediate frameshift, the clone HKNG1Δ7 encodes a truncated form of the HKNG1 protein. The HKNG1Δ7 cDNA sequence (SEQ ID NO:65) is depicted in FIGS. 18A-18C along with the predicted amino acid sequence (SEQ ID NO:66) of the HKNG1Δ7 gene product it encodes.

[0508] Two other novel internal splice variants, referred to herein as HKNG1-V2 and HKNG1-V3, were isolated and identified by RT-PCR analysis. The RT-PCR product derived from HKNG1-V2 includes a novel exon referred to as “exon 2′”, whereas the RT-PCR product derived from HKNG1-V3 includes a novel exon referred to as “exon 2″”. The sequences of these novel exons are provided in Table 2 below. The nucleotide sequence of the HKNG1-V2 RT-PCR product containing novel exon 2′ is depicted in FIG. 6A (SEQ ID NO:36), whereas the HKNG1-V3 RT-PCR product containing novel exon 2″ is depicted in FIG. 6B (SEQ ID NO:37). Both exon 2′ and 2″ are part of the 5′-untranslated region of the HKNG1 cDNA. The intron/exon organization of HKNG1 is summarized in FIG. 19.

TABLE 2

Exon 2′ (SEQ ID NO:34)
5′-TTCCCTCCCTTTGGAACGCAGCGT
GGGCACCTGCAACGCAGAGACCACTGTATCCCCGGTGCAGAATGTAATGAGTGC
CTGATACATTTGCCGAATAAACTATTCCAAGGGTTGAACTTGCTGGAAGCAAGA
GAAGCACTATTCTGG-3′

Exon 2″ (SEQ ID NO:35)
5′-ATGGAGTCTTGGTCTCGTTGCCCA
GACTGGAGTGCACTGCTGCGATCTCAGCTCACTGCAACCTCTACCTCCCAGGTT
CAAGCGATTCTCCTGCCTCAGCCTCTCGAGTGGCTGGGACTATAG-3′

[0509] To investigate the possibility of alternate splice variants at the 3′ end of the HKNG1 gene, 3′ RACE was performed according to the above-described methods. Novel RT-PCR sequences were isolated which suggest the existence of at least three novel 3′ splice variants of HKNG1. The first such splice variant, which is referred to herein as HKNG1Δ10 and is depicted schematically in FIG. 20B, does not contain Exon 10 of the HKNG1 genomic sequence depicted in FIGS. 3A-1-3A-28. The RT-PCR sequence corresponding to this splice variant is shown in FIG. 21A (SEQ ID NO:121). Removal of Exon 10 from the HKNG1 cDNA is predicted to cause a frame shift. Thus, the HKNG1Δ10 splice variant is predicted to encode a novel gene product, which is depicted in FIGS. 21B-1 and 21B-2 (SEQ ID NO:131). Specifically, the predicted HKNG1Δ10 gene product comprises the sequence corresponding to amino acid residues 1-428 of the full length HKNG1 gene product shown in FIGS. 1A-1C (SEQ ID NO:2), followed by the novel carboxy-terminal sequence “RRSNASYIQ” (SEQ ID NO:132).

[0510] A second 3′ splice variant, which is shown schematically in FIG. 20C, contains Exons 9 and 10 of the HKNG1 genomic sequence and further comprises sequences which were previously identified as HKNG1 intronic sequences. Specifically, such a splice variant, which is referred to herein as “HKNG1+intron10,” further comprises an additional 125 bases of nucleotide sequence corresponding to the region that was originally identified as Intron 10 (i.e., the “intronic” sequence between Exons 10 and 11 in FIGS. 3A-1-3A-28). The RT-PCR sequence corresponding to this splice variant is shown in FIG. 22 (SEQ ID NO:122). Because the additional sequences of this splice variant are within the predicted 3′-untranslated region of the HKNG1+intron10 cDNA sequence, this splice variant is predicted to encode a gene product that is identical to the full length HKNG1 gene product shown in FIGS. 1A-1C (SEQ ID NO:2).

[0511] The third 3′ splice variant, which is shown schematically in FIG. 20D, is referred to herein as “HKNG1+10′.” The RT-PCR fragment isolated from this variant is shown in FIG. 23A, and suggests that the splice variant comprises sequences from a novel Exon, referred to herein as Exon 10′, which is located between Exons 10 and 11 of the HKNG1 genomic sequence shown in FIGS. 3A-1-3A-28. The addition of the novel Exon 10′ to the cDNA sequence of this splice variant introduces an immediate STOP codon. Thus, the 3′ splice variant HKNG1+10′ is predicted to encode a gene product, depicted in FIGS. 23B and 23C (SEQ ID NO:133), whose sequence is identical to the sequence of amino acid residues 1-494 of the full length HKNG1 gene product (shown in FIGS. 1A-1C; SEQ ID NO:2) but does not include the final tryptophan amino acid residue at position 495 of the full length HKNG1 gene product sequence.

[0512] Many of the above-described clones which were identified by 3′ RACE lacked a polyA tract, which is normally seen in 3′ RACE products derived using the methods described hereinabove, suggesting that the clones are, in fact, 5′ RACE products produced by a sequence encoded by the DNA strand that lies opposite the HKNG1 gene on human chromosome 18p.

[0513] The different HKNG1 splice variants identified are summarized in Table 3, below.

TABLE 3

HKNG1 splice variant    Description
HKNG1-V1                containing a deletion of exon 3
HKNG1-V2                containing novel exon 2′
HKNG1-V3                containing novel exon 2″
HKNG1Δ10                containing a deletion of exon 10
HKNG1+intron10          containing exon 9 and 10, extending into intron 10
HKNG1+10′               containing novel Exon 10′ between Exons 10 and 11

10. EXAMPLE Identification of HKNG1 Orthologs

[0514] This example describes the isolation and characterization of genes in other mammalian species which are orthologs to human HKNG1. Specifically, both guinea pig and bovine HKNG1 sequences are described.

10.1. GUINEA PIG HKNG1 ORTHOLOGS

[0515] A guinea pig HKNG1 ortholog, referred to as gphkng1815, was isolated from a 104C1 cell line cDNA library by hybridization to a ³²P labeled human HKNG1 cDNA probe. The cDNA sequence (SEQ ID NO:38) and predicted amino acid sequence (SEQ ID NO:39) are depicted in FIGS. 7A-7C. Both the nucleotide and the predicted amino acid sequence of gphkng1815 are similar to the human HKNG1 nucleotide and amino acid sequences. Specifically, the program ALIGNv2.0 identified a 71.5% nucleotide sequence identity and a 62.8% amino acid sequence identity using standard parameters (Scoring Matrix: PAM120; GAP penalties: −12/−4).
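
Percent identity figures of the kind quoted above are derived from a pairwise alignment. The sketch below shows one common way of computing the figure from two sequences that have already been aligned (gaps written as "-"). It is not the ALIGN program, it does not perform the alignment itself, and the choice of denominator (all alignment columns that are not gap-against-gap) is an assumption; the two short sequences are invented toy examples rather than HKNG1 sequences.

```python
# Sketch only: toy sequences and an assumed definition of percent identity.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; columns where
    only one sequence has a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical pre-aligned peptide fragments, for illustration only.
    seq_a = "MKT-AYIAK"
    seq_b = "MKTWAY-AR"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Different programs normalize by alignment length, by the shorter sequence, or by ungapped columns only, so values from different tools are not always directly comparable.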

[0516] Like the human HKNG1 polypeptide, the predicted gphkng1815 polypeptide also contains two clusterin similarity domains, which correspond to amino acid residues 105 to 131 of the full length gphkng1815 polypeptide (clusterin domain 1; SEQ ID NO:127), and amino acid residues 305-333 of the full length gphkng1815 polypeptide (clusterin domain 2; SEQ ID NO:128), respectively. One of these domains contains the five conserved cysteine residues typically associated with clusterin domains. The other domain contains four of the five cysteine residues. Specifically, these conserved cysteines correspond to Cys105, Cys116, Cys119, Cys124 and Cys131 (clusterin similarity domain 1) and Cys314, Cys321, Cys324, and Cys332 (clusterin similarity domain 2) of the gphkng1815 polypeptide sequence (FIG. 7A).

[0517] Three allelic variants of gphkng1815, referred to as gphkng7b, gphkng7c, and gphkng7d, respectively, were also identified by RT-PCR. Their nucleotide [SEQ ID NO:40 (gphkng7b), SEQ ID NO:42 (gphkng7c), and SEQ ID NO:44 (gphkng7d)] and amino acid [SEQ ID NO:41 (gphkng7b), SEQ ID NO:43 (gphkng7c), and SEQ ID NO:45 (gphkng7d)] sequences are depicted in FIGS. 8A-10C, respectively. Each of these three allelic variants contains a deletion within a region homologous to exon 7 of human HKNG1. The allelic variants retain the open reading frame of the gene; however, each allelic variant contains a deletion, relative to gphkng1815, of 16, 92, and 93 amino acid residues, respectively.
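
Whether a deletion retains the open reading frame depends only on whether the number of deleted nucleotides is a multiple of three. The sketch below illustrates this for the three deletions described above, taking each deleted amino acid residue as three nucleotides, and contrasts them with a hypothetical 100-nucleotide deletion that would shift the frame. The 100-nucleotide figure is invented for illustration and is not the length of any HKNG1 exon.

```python
# Sketch only: the 100-nt example is hypothetical.

def preserves_reading_frame(deleted_nucleotides):
    """True if removing this many coding nucleotides leaves downstream codons in frame."""
    if deleted_nucleotides < 0:
        raise ValueError("deletion length cannot be negative")
    return deleted_nucleotides % 3 == 0


def frame_offset(deleted_nucleotides):
    """Number of bases by which downstream codons are displaced (0, 1 or 2)."""
    return deleted_nucleotides % 3


if __name__ == "__main__":
    # Deletions of 16, 92 and 93 residues correspond to 48, 276 and 279 nucleotides.
    for name, residues in (("gphkng7b", 16), ("gphkng7c", 92), ("gphkng7d", 93)):
        nt = residues * 3
        print(f"{name}: {nt} nt deleted, in frame: {preserves_reading_frame(nt)}")

    hypothetical_nt = 100  # invented length, not from the text
    print(f"hypothetical {hypothetical_nt} nt deletion, in frame: "
          f"{preserves_reading_frame(hypothetical_nt)} "
          f"(offset {frame_offset(hypothetical_nt)})")
```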

[0518] Alignments of the predicted nucleotide and amino acid sequences of gphkng1815, gphkng7b, gphkng7c, and gphkng7d, as well as the “Majority” sequence, are shown in FIGS. 14A-M.

10.2. BOVINE HKNG1 ORTHOLOGS

[0519] Bovine orthologs of HKNG1 were cloned by screening a cDNA library made from pooled bovine retinal tissue using a nucleotide sequence that corresponded to the complementary sequence of base pairs 910-1422 of the full length human HKNG1 cDNA sequence (SEQ ID NO:1) as a probe. Three independent bovine cDNA species, referred to as bhkng1, bhkng2, and bhkng3 (SEQ ID NOs: 46 to 48, respectively) were isolated. Each of these allelic variants contains several single nucleotide polymorphisms (SNPs). None of the SNPs results in an altered predicted amino acid sequence. Thus, all three bovine cDNAs encode the same predicted amino acid sequence (SEQ ID NO:49). These SNPs apparently reflect the natural allelic variation of the pooled cDNA library from which the sequences were isolated. Each of the three bovine HKNG1 allelic variants is depicted in FIGS. 11A-13C, respectively, along with the predicted amino acid sequence which they encode. An alignment of the nucleotide sequences of each of these bovine cDNA species (i.e., of bhkng1, bhkng2, and bhkng3) is shown in FIGS. 15A-15F.

[0520] The predicted bovine HKNG1 polypeptide also contains two clusterin similarity domains, corresponding to amino acid residues 105-131 (bovine clusterin similarity domain 1; SEQ ID NO:129) and amino acid residues 304-332 (bovine clusterin similarity domain 2; SEQ ID NO:130), respectively, of SEQ ID NO:49. Bovine clusterin similarity domain 1 contains the five shared cysteine amino acid residues typically associated with this type of domain: Cys105, Cys116, Cys119, Cys124, and Cys131. Bovine clusterin similarity domain 2 contains four conserved cysteine residues: Cys315, Cys322, Cys325, and Cys333 (FIG. 13A).

[0521] An alignment of the predicted amino acid sequences of the human HKNG1 gene product, the guinea pig HKNG1 ortholog gphkng1815, and the bovine HKNG1 ortholog described in this Subsection 10.2 is shown in FIG. 16. The high degree of sequence identity between these orthologs, which is described above and apparent from these alignments, confirms that true HKNG1 orthologs can be found in diverse mammalian species, thus validating methods such as those described in Section 5.6.4, above.

11. EXAMPLE Expression of Human HKNG1 Gene Product

[0522] This Example describes the construction of expression vectors and the successful expression of recombinant human HKNG1 sequences. Expression vectors are described both for native HKNG1 and for various HKNG1 fusion proteins.

[0523] Expression of Human HKNG1:FLAG:

[0524] A human HKNG1 flag epitope-tagged protein (HKNG1:flag) vector was constructed by PCR followed by ligation into a vector for expression in HEK 293T cells. The full open-reading frame of the full length HKNG1 cDNA sequence (SEQ ID NO:5) was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTGAATTCGCCACCATGAAAATTAAAGCAGAGAAAAACG-3′ (SEQ ID NO:52)
3′ primer: 5′-TTTTTGTCGACTTATCACTTGTCGTCGTCGTCCTTGTAGTCCCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:53)

[0525] The 5′ primer incorporated a Kozak sequence upstream of the initiator methionine in exon 3. The 3′ primer included the nucleotide sequence encoding the flag epitope DYKDDDDK (SEQ ID NO:50) followed by a termination codon.
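
The design of the 3′ primer can be checked by reverse-complementing it and translating the result, since a reverse primer is written as the complement of the coding strand. The sketch below does this for the 3′ primer of SEQ ID NO:53 using the standard genetic code and searches the three reading frames for the flag epitope followed by a stop codon. The helper names are illustrative, and the sketch makes no claim about how the primers were originally designed.

```python
# Sketch only: a sanity check on the primer sequence, using the standard genetic code.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# 3' primer (SEQ ID NO:53), joined from the fragments printed in the text.
PRIMER_3 = ("TTTTTGTCGACTTATCACTTG"
            "TCGTCGTCGTCCTTGTAGTCCCAG"
            "GTTTTAAAATGTTCCTTAAAATG"
            "C")

FLAG_EPITOPE = "DYKDDDDK"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=0):
    """Translate seq from the given 0-based frame; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


if __name__ == "__main__":
    coding_strand = reverse_complement(PRIMER_3)
    for frame in range(3):
        protein = translate(coding_strand, frame)
        marker = "  <- flag epitope + stop" if FLAG_EPITOPE + "*" in protein else ""
        print(f"frame {frame}: {protein}{marker}")
```

In the frame that contains the epitope, the residues immediately preceding it correspond to the carboxy terminus of the HKNG1 open reading frame, ending in tryptophan.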

[0526] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0527] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an M2 anti-flag monoclonal antibody (1:500, Sigma) followed by horseradish peroxidase (HRP) conjugated sheep anti-mouse antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). Flag immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 60 and 95 kDa as determined by Multimark molecular weight markers (Novex), demonstrating secretion of the HKNG1:flag protein. The double band indicates at least two different species with different mobilities on SDS-PAGE. Such doublets most commonly arise from posttranslational modifications to the protein, such as glycosylation and/or proteolysis. Treatment with PNGase F (Oxford Glycosciences) according to the manufacturer's directions resulted in a single band of increased mobility, indicating that the two original bands contain N-linked carbohydrate. When run in the absence of a reducing agent, the relative mobility of the immunoreactive bands was greater than 100 kDa relative to the same markers, indicating that the HKNG1:flag fusion protein may be a disulfide linked dimer or higher oligomer.

[0528] Expression of Human HKNG1-V1:FLAG:

[0529] A human HKNG1-V1 flag epitope-tagged protein (HKNG1-V1:flag) vector was also constructed by PCR followed by ligation into an expression vector, pMET stop. The full length open-reading frame of the HKNG1-V1 cDNA sequence (SEQ ID NO:6) was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTGAATTCACCATGAGGACCTGGGACTACAGTAAC-3′ (SEQ ID NO:54)
3′ primer: 5′-TTTTTGTCGACTTATCACTTGTCGTCGTCGTCCTTGTAGTCCCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:53)

[0530] The 5′ primer incorporated a Kozak sequence upstream of and including the initiator methionine in exon 2. The 3′ primer included the nucleotide sequence encoding the flag epitope DYKDDDDK (SEQ ID NO:50) followed by a termination codon.

[0531] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0532] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an M2 anti-flag monoclonal antibody (1:500, Sigma) followed by horseradish peroxidase (HRP) conjugated sheep anti-mouse antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). Flag immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 60 and 95 kDa as determined by Multimark molecular weight markers (Novex), demonstrating secretion of the HKNG1-V1:flag protein. When run in the absence of reducing agent, the relative mobility of the immunoreactive bands was greater than 100 kDa relative to the same markers, suggesting that the HKNG1-V1:flag fusion protein may be a disulfide linked dimer or higher oligomer.

[0533] Expression of Human HKNG1:Fc:

[0534] A human HKNG1/hIgG1Fc fusion protein vector was constructed by PCR. The open-reading frame of the HKNG1 cDNA (SEQ ID NO:5), from the initiator methionine in exon 3 to the amino acid residue before the stop codon, was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTCTCGAGACCATGAAAATTAAAGCAGAGAAAAACG-3′ (SEQ ID NO:55)
3′ primer: 5′-TTTTTGGATCCGCTGCTGCCCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:56)

[0535] The 5′ primer incorporated a Kozak sequence upstream of the initiator methionine in exon 3. The 3′ PCR primer contained a 3 alanine linker at the junction of HKNG1 and the human IgG1 Fc domain, which starts at residues DPE. The genomic sequence of the human IgG1 Fc domain was ligated along with the PCR product into a pCDM8 vector (Invitrogen, Carlsbad Calif.) for transient expression.

[0536] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0537] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an anti-Fc polyclonal antibody (1:500, Jackson ImmunoResearch Laboratories, Inc.) followed by horseradish peroxidase (HRP) conjugated sheep anti-mouse antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). Human IgG1 Fc immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 148 and 60 kDa standards of the Multimark molecular weight markers (Novex), demonstrating secretion of the HKNG1:Fc fusion protein.

[0538] Expression of Human HKNG1-V1:Fc:

[0539] A human HKNG1-V1/hIgG1Fc fusion protein (HKNG1-V1:Fc) vector was also constructed by PCR. The full-length open reading frame of HKNG1-V1 cDNA (SEQ ID NO:6), from the initiator methionine in exon 2 to the amino acid residue before the stop codon, was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTCTCGAGACCATGAGGACCTGGGACTACAGTAAC-3′ (SEQ ID NO:57)
3′ primer: 5′-TTTTTGGATCCGCTGCTGCCCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:56)

[0540] The 5′ primer incorporated a Kozak sequence upstream of the initiator methionine in exon 2. The 3′ PCR primer contained a 3 alanine linker at the junction of HKNG1-V1 and the human IgG1 Fc domain, which starts at residues DPE. The genomic sequence of the human IgG1 Fc domain was ligated along with the PCR product into a pCDM8 vector for transient expression.

[0541] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0542] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an anti-human Fc polyclonal antibody (1:500, Jackson ImmunoResearch Laboratories, Inc.) followed by horseradish peroxidase (HRP) conjugated sheep anti-mouse antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). Human IgG1 Fc immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 148 and 60 kDa standards of the Multimark molecular weight markers (Novex) centered approximately between 125 and 150 kDa, demonstrating secretion mediated by the HKNG1 signal peptide.

[0543] Expression of Human HKNG1Δ7:Fc:

[0544] A human HKNG1Δ7:hIgG1Fc fusion protein vector was also constructed by PCR. The sequence of the HKNG1Δ7 splice variant, from the initiator methionine in exon 4 through the end of exon 6, was PCR amplified using the HKNG1 cDNA sequence (SEQ ID NO:1) as a template and with the following primer sequences:

5′ primer: 5′-TTTTTCTGAATTCACCATGAAGCCGCCACTCTTGGTG-3′ (SEQ ID NO:58)
3′ primer: 5′-TTTTTGGATCCGCTGCGGCCTCCGTGGTCAGGAGCTTATTTTTCACAGAGGACCAGCTAG-3′ (SEQ ID NO:59)

[0545] The 5′ primer incorporated a Kozak sequence upstream of the initiator methionine in exon 4. The 3′ primer included the first 17 (coding) nucleotides of exon 8 followed by nucleotides encoding a 3 alanine linker.

[0546] The genomic sequence of the human IgG1 Fc domain was ligated along with the PCR product into a pCDM8 vector for transient expression.

[0547] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0548] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an anti-human Fc polyclonal antibody (1:500, Jackson ImmunoResearch Laboratories) followed by horseradish peroxidase (HRP) conjugated sheep anti-mouse antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). Human IgG1 Fc immunoreactivity appeared as a band that migrated by SDS-PAGE between 42 and 60 kDa relative to Multimark molecular weight markers (Novex) centered approximately between 36.5 and 55.4 kDa relative to Mark 12 molecular weight markers (Novex).

[0549] Expression of Native Human HKNG1:

[0550] A human HKNG1 expression vector was constructed by PCR amplification of the human HKNG1 cDNA sequence (SEQ ID NO:1) followed by ligation into an expression vector, pcDNA3.1 (Invitrogen, Carlsbad Calif.). The full open-reading frame of the HKNG1 cDNA sequence (SEQ ID NO:5) was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTCTCGAGGACTACAGGACACAGCTAAATCC-3′ (SEQ ID NO:60)
3′ primer: 5′-TTTTTGGATCCTTATCACCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:61)

[0551] The 3′ primer included a tandem pair of termination codons.

[0552] The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0553] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an anti-HKNG1 polyclonal antibody (#84, 1:500) followed by horseradish peroxidase (HRP) conjugated donkey anti-rabbit antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). HKNG1 immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 60 and 95 kDa as determined by Multimark molecular weight markers (Novex).

[0554] Expression of Native Human HKNG1-V1:

[0555] A human HKNG1-V1 expression vector was also constructed by PCR amplification of the human HKNG1-V1 cDNA sequence (SEQ ID NO:3) followed by ligation into an expression vector, pcDNA3.1. The full open-reading frame of the HKNG1-V1 cDNA sequence (SEQ ID NO:6) was PCR amplified using the following primer sequences:

5′ primer: 5′-TTTTTCTGAATTCACCATGAAGCCGCCACTCTTGGTG-3′ (SEQ ID NO:62)
5′ primer: 5′-TTTTTCTCTCGAGACCATGAGGACCTGGGACTACAGTAAC-3′ (SEQ ID NO:63)
3′ primer: 5′-TTTTTGGATCCTTATCACCAGGTTTTAAAATGTTCCTTAAAATGC-3′ (SEQ ID NO:61)

[0556] Each of the 5′ primers incorporates a Kozak sequence upstream of the initiator methionine. Use of the first 5′ primer (SEQ ID NO:62) drives expression of HKNG1 from the methionine initiator codon in exon 4, whereas use of the second 5′ primer (SEQ ID NO:63) preferentially drives expression of HKNG1 from the methionine initiator codon in exon 2, although some translation may initiate in exon 4. The 3′ primer included a tandem pair of termination codons. The sequenced DNA construct was transiently transfected into HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. Seventy-two hours post-transfection, the serum-free conditioned medium (OptiMEM, GIBCO/BRL) was harvested and spun and the remaining monolayer of cells was lysed using 2 mL of lysis buffer [50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40, 0.05% SDS with “Complete” protease cocktail (Boehringer Mannheim) diluted according to the manufacturer's instructions]. Insoluble material was pelleted before preparation of SDS-PAGE samples.

[0557] Conditioned medium was electroblotted onto a PVDF membrane (Novex) after separation by SDS-PAGE on 4-20% gradient gels and probed with an anti-HKNG1 polyclonal antibody (#84, 1:500) followed by horseradish peroxidase (HRP) conjugated donkey anti-rabbit antibody (1:5000, Amersham), developed using chemiluminescent reagents (Renaissance, Dupont), and exposed to autoradiography film (Biomax MR2 film, Kodak). HKNG1 immunoreactivity appeared as a doublet of bands that migrated by SDS-PAGE between 70 and 95 kDa as determined by Multimark molecular weight markers (Novex), demonstrating secretion mediated by the HKNG1 signal peptide.

[0558] Expression of Human HKNG:AP Fusion Proteins:

[0559] Expression vectors were also constructed for human HKNG1 alkaline phosphatase C-terminal fusion protein (HKNG1:AP), human HKNG1-V1 alkaline phosphatase C-terminal fusion protein (HKNG1-V1:AP), and human HKNG1 alkaline phosphatase N-terminal fusion protein (AP:HKNG1).

[0560] The expression vector for human HKNG1:AP was constructed by PCR amplification followed by ligation into a vector suitable for expression in HEK 293T cells. The full-length open-reading frame of human HKNG1 (SEQ ID NO:5) was PCR amplified using a 5′ primer incorporating an EcoRI restriction site followed by a Kozak sequence prior to the upstream initiator methionine. The 3′ primer included an XhoI restriction site immediately following the final (non-termination) codon of HKNG1. Thus, the open reading frame of the construct includes the HKNG1 signal peptide and the full HKNG1 sequence followed by the full sequence of human placental alkaline phosphatase.

[0561] The expression vector for human HKNG1-V1:AP was constructed by PCR amplification followed by ligation into pN8 epsilon vector. The full-length open-reading frame of human HKNG1-V1 (SEQ ID NO:6) was PCR amplified using a 5′ primer incorporating an EcoRI restriction site followed by a Kozak sequence prior to the upstream initiator methionine. The 3′ primer included an XhoI restriction site immediately following the final codon of HKNG1-V1. Thus, the open reading frame of the construct includes the HKNG1-V1 signal peptide and the full-length HKNG1-V1 sequence followed by the full sequence of human placental alkaline phosphatase.

[0562] The expression vector for human AP:HKNG1 was constructed by PCR amplification followed by ligation into the AP-Tag3 vector reported by Cheng and Flanagan, 1994, Cell 79:157-168. The full-length open-reading frame of human HKNG1 (SEQ ID NO:5) was PCR amplified using a 5′ primer incorporating a BamHI restriction site prior to the nucleotides encoding the first amino acids (i.e., APT) of the mature HKNG1 protein, and a 3′ primer that included an XhoI restriction site immediately following the termination codon of HKNG1. Thus, the open reading frame of the complete construct includes the AP signal peptide and the full sequence of human placental alkaline phosphatase, followed by the full HKNG1 sequence.

[0563] The sequenced DNA constructs were transiently transfected in HEK 293T cells in 150 mm plates using Lipofectamine (GIBCO/BRL) according to the manufacturer's protocol. 72 hours post-transfection, the serum-free conditioned media (OptiMEM, GIBCO/BRL) were harvested, spun and filtered. Alkaline phosphatase activity in the conditioned media was quantitated using an enzymatic assay kit (Phospha-Light, Tropix) according to the manufacturer's instructions. When alkaline phosphatase fusion protein concentrations below 2 nM were observed, conditioned medium was concentrated by centrifugation using a 30 kDa cut-off membrane. Conditioned medium samples before and after concentration were analyzed by SDS-PAGE followed by Western blot using anti-human alkaline phosphatase antibodies (1:250, Genzyme) and chemiluminescent detection. A band at 140 kDa was observed in concentrated supernatant of HKNG1:AP, HKNG1-V1:AP, and AP:HKNG1 transfections. Conditioned medium samples were adjusted to 10% fetal calf serum and stored at 4° C.

[0564] Purification of Flag-Tagged HKNG1 Proteins:

[0565] The secreted flag-tagged proteins described above were isolated by a one step purification scheme utilizing the affinity of the flag epitope to M2 anti-flag antibodies. The conditioned media were passed over an M2-biotin (Sigma)/streptavidin Poros column (2.1×30 mm, PE Biosystems). The column was then washed with PBS, pH 7.4, and flag-tagged protein was eluted with 200 mM glycine, pH 3.0. Fractions were neutralized with 1.0 M Tris pH 8.0. Eluted fractions with 280 nm absorbance greater than background were then analyzed on SDS-PAGE gels and by Western blot. The fractions containing flag-tagged protein were pooled and dialyzed in 8000 MWCO dialysis tubing against 2 changes of 4 L PBS, pH 7.4 at 4° C. with constant stirring. The buffer-exchanged material was then sterile filtered (0.2 μm, Millipore) and frozen at −80° C.

[0566] Purification of HKNG1:Fc Fusion Proteins:

[0567] The secreted Fc fusion proteins described above were isolated by a one step purification scheme utilizing the affinity of the human IgG1 Fc domain to Protein A. The conditioned media were passed over a POROS A column (4.6×100 mm, PerSeptive Biosystems); the column was then washed with PBS, pH 7.4 and eluted with 200 mM glycine, pH 3.0. Fractions were neutralized with 1.0 M Tris pH 8.0. A constant flow rate of 7 ml/min was maintained throughout the procedure. Eluted fractions with 280 nm absorbance greater than background were then analyzed on SDS-PAGE gels and by Western blot. The fractions containing Fc fusion protein were pooled and dialyzed in 8000 MWCO dialysis tubing against 2 changes of 4 L PBS, pH 7.4 at 4° C. with constant stirring. The buffer-exchanged material was then sterile filtered (0.2 μm, Millipore) and frozen at −80° C.

12. PRODUCTION OF ANTI-HKNG1 ANTIBODIES

[0568] The Example presented in this Section describes the production and characterization of polyclonal and monoclonal antibodies directed against HKNG1 proteins.

12.1. PRODUCTION OF POLYCLONAL ANTIBODIES

[0569] Polyclonal antisera were raised in rabbits against each of the three peptides listed in Table 4 below. Each of the peptides was derived from the HKNG1 amino acid sequence (SEQ ID NO:2) by standard techniques (see, in particular, Harlow & Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, the contents of which are incorporated herein by reference in their entirety). Each of the peptides is also represented in the HKNG1-V1 polypeptide sequence (SEQ ID NO:4). Antisera were subsequently affinity purified using the peptide immunogens.

TABLE 4
Antibody       Peptide/Immunogen    a.a. residues (SEQ ID NO:2)
Antibody 84    APTWKDKTAISENLK      50-64
Antibody 85    KAIEDLPKQDK          304-314
Antibody 86    KALQHFKEHFKTW        483-495

12.2. PRODUCTION OF MONOCLONAL ANTIBODIES

[0570] Monoclonal antibodies were raised in mice by standard techniques (see, Harlow & Lane, supra) against the HKNG-Fc fusion protein described in Section 11 above. Wells were screened by ELISA for binding to the HKNG-Fc fusion protein. Those wells reacting with the Fc protein were identified by ELISA for binding to an irrelevant Fc fusion protein and discarded. HKNG-Fc specific wells were tested for their ability to immunoprecipitate HKNG-Fc and subjected to isotype analysis by standard techniques (Harlow & Lane, supra), and eight wells were selected for subcloning. The isotype of the subcloned monoclonal antibodies was confirmed and is presented in Table 5, below.

[0571] Based on Western blotting, immunoprecipitation and immunostaining data discussed in Subsection 12.3, below, two monoclonal antibodies (3D17 and 4N6) were selected for large scale production.

TABLE 5
Clone    Isotype
1F24     2b
1J18     2a
2O20     1
3D17     2a
3D24     1
4N6      1
4O16     2b
10C6     2a

12.3. WESTERN BLOTTING AND IMMUNOPRECIPITATION OF RECOMBINANT HKNG1 PROTEIN

[0572] The polyclonal antisera and all eight monoclonal antibodies described in Subsections 12.1 and 12.2, above, were tested for their ability to recognize recombinant HKNG1 proteins on Western blots using standard techniques (see, in particular, Harlow & Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press). Polyclonal antisera 84 and 85 and monoclonal antibodies 3D17 and 4N6 were able to recognize all forms of the mature (i.e., secreted) recombinant HKNG1 proteins tested (i.e., HKNG1:Fc, HKNG1:flag, AP:HKNG1, and native HKNG1) in Western blots.

[0573] Table 6, below, indicates the ability of each monoclonal antibody to immunoprecipitate recombinant HKNG1, as assessed by Western blotting of immunoprecipitates with the polyclonal antisera 84 and 85. None of the polyclonal antisera were able to immunoprecipitate recombinant HKNG1 proteins. All eight monoclonal antibodies immunoprecipitated HKNG1:Fc. Immunoprecipitation of the other recombinant HKNG1 proteins was variable.

TABLE 6
                       Protein
Monoclonal Antibody    HKNG1:Fc    HKNG1:flag    AP:HKNG1    HKNG1 (native)
1F24                   +           +             +           −/+
1J18                   +           −             −/+         ++
2O20                   +           −             +           −
3D17                   ++          ++            −           ++
3D24                   +           −             −           −
4N6                    +           +             +           +
4O16                   +           −             −           ++
10C6                   +           −             −           +

13. EXAMPLE Confirmation of the HKNG1 N-Terminus and Characterization of the Disulfide Bond Structure

[0574] The experiments described in this section provide data identifying the N-terminus of the mature secreted human HKNG1 protein. The experiments also provide data identifying the disulfide bond linkages between cysteine amino acid residues in the mature, secreted protein.

[0575] Specifically, mature, secreted HKNG:flag, HKNG, and HKNG:Fc recombinant proteins were produced and purified as described in the example presented in Section 11, above. The mature recombinant proteins were digested with trypsin, and the tryptic fragments were identified and sequenced using reverse-phase liquid chromatography coupled with electrospray ionization tandem mass spectrometry (LC/MS/MS). The N-terminus of all mature secreted proteins tested was unambiguously identified as APTWKDKT, which corresponds to the amino acid sequence starting at alanine 50 of the HKNG1 amino acid sequence (FIGS. 1A-C; SEQ ID NO:2) or alanine 32 of the HKNG1-V1 amino acid sequence (FIGS. 2A-C; SEQ ID NO:4). Thus, although the cDNA sequences of HKNG1 and HKNG1-V1 encode distinct amino acid sequences, the mature secreted proteins produced by these two splice variants of the human HKNG1 gene are identical, since the alternative splicing that gives rise to HKNG1-V1 (i.e., the deletion of exon 3) affects only the amino acid sequence of the proteolytically cleaved signal peptide. The amino acid sequence of the mature secreted HKNG1 protein is shown in FIG. 22 (SEQ ID NO:122).

[0576] The mature secreted HKNG1 protein is also distinct from the RPP amino acid sequence disclosed by Shimizu-Matsumoto et al. (1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585). In particular, amino acid residues 1 to 20 of the RPP amino acid sequence disclosed in FIG. 3 of Shimizu-Matsumoto et al., supra, correspond to the cleaved signal peptide of HKNG1-V1.

[0577] Disulfide bond linkages for 8 of the 13 cysteine residues in the mature, secreted HKNG1 protein were also identified from LC/MS/MS of peptides recovered from tryptic digestion of the unreduced protein. In particular, the following disulfide bonded pairs of cysteines were identified (numbering refers to the HKNG1 protein shown in FIGS. 1A-C; SEQ ID NO:2): Cys 134 to Cys 145; Cys 148 to Cys 153; Cys 160 to Cys 334; and Cys 354 to Cys 362.
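Because the disulfide pairs are numbered against SEQ ID NO:2 while the mature protein begins at alanine 50 of that sequence (alanine 32 of SEQ ID NO:4), it can be convenient to restate the same residues in the other numbering schemes. The following Python sketch is illustrative only. It assumes that, downstream of the signal-peptide cleavage site, the three numberings differ by constant offsets, which follows from the two variants yielding an identical mature protein; the function names are hypothetical.

```python
# Residue-numbering conversion for the reported disulfide pairs (illustrative).
MATURE_START_HKNG1 = 50  # Ala 50 of SEQ ID NO:2 is residue 1 of the mature protein
MATURE_START_V1 = 32     # Ala 32 of SEQ ID NO:4 is the same residue

# Disulfide pairs as reported, numbered against SEQ ID NO:2.
DISULFIDES = [(134, 145), (148, 153), (160, 334), (354, 362)]


def to_mature(residue):
    """Convert SEQ ID NO:2 numbering to mature-protein numbering."""
    return residue - MATURE_START_HKNG1 + 1


def to_v1(residue):
    """Convert SEQ ID NO:2 numbering to SEQ ID NO:4 numbering (mature region only)."""
    return residue - (MATURE_START_HKNG1 - MATURE_START_V1)


mature_pairs = [(to_mature(a), to_mature(b)) for a, b in DISULFIDES]
v1_pairs = [(to_v1(a), to_v1(b)) for a, b in DISULFIDES]
assigned = {cys for pair in DISULFIDES for cys in pair}

print("mature numbering:", mature_pairs)
print("SEQ ID NO:4 numbering:", v1_pairs)
print(len(assigned), "of 13 cysteines assigned to disulfide bonds")
```

The conversion applies only to residues at or after the mature N-terminus; positions within the signal peptide differ between the two variants and are not covered by this sketch.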

14. EXAMPLE Localization of HKNG1 mRNA and Protein Expression

[0578] This Example describes experiments wherein the HKNG1 gene product is shown to be expressed in human and primate brain tissue and in human retinal tissue. Specifically, in situ hybridization experiments performed using standard techniques with a probe that corresponded to the complementary sequence of base pairs 910-1422 of the full length human HKNG1 cDNA sequence (SEQ ID NO:1) detected HKNG1 messenger RNA in the photoreceptor layer (outer nuclear layer) of human retina in eyes obtained from the New England Eye Bank.

[0579] The polyclonal antisera and all eight monoclonal antibodies described in Section 12, above, were tested for immunostaining of human retina. Polyclonal antiserum 85 and monoclonal antibodies 1F24, 4N6 and 4O16 showed immunostaining of HKNG1 protein in the photoreceptor layer and adjacent layers of the retina. The immunostaining in these tissues with polyclonal antiserum 85 was blocked by the 85 peptide immunogen, but not by the other two peptide immunogens (i.e., 84 and 86), confirming that the immunostaining was due to HKNG1 protein expressed in the photoreceptor layer.

[0580] The same antibodies were then used to localize HKNG1 protein by immunostaining in sections of human and monkey brain. HKNG1 protein was observed in cortical neurons in the frontal cortex. The majority of pyramidal neurons in layers IV-V were immunoreactive for HKNG1 protein. A subpopulation of neurons was also labeled in layers I-III. HKNG1 immunoreactivity was also observed in the pyramidal cell layer of the hippocampus and in a small number of neurons in the striatum.

[0581] These data further support the fact that HKNG1 is, indeed, a gene which mediates neuropsychiatric disorders such as BAD. Furthermore, the fact that HKNG1 is also expressed in human retinal tissue indicates that the gene also plays a role in myopic conditions. Specifically, Young et al. (1998, American Journal of Human Genetics 63:109-119) report a strong linkage (LOD=9.59) for primary myopia and secondary macular degeneration and retinal detachment in the telomeric region of human chromosome 18p. Through fine mapping analysis, this candidate region has been narrowed to a 7.6 cM haplotype flanked by markers D18S59 and D18S1138 (Young et al., supra). The marker D18S59 lies within the HKNG1 gene. This fact, coupled with the finding that HKNG1 is expressed at high levels in the retina, strongly suggests that the HKNG1 gene is also responsible for human myopia conditions and/or other eye-related diseases such as primary myopia, secondary macular degeneration, and retinal detachment.

15. EXAMPLE Immature Protein Products of the HKNG1 cDNA Sequences

[0582] This section describes experiments which were performed to determine which of the two putative initiator methionines encoded by both the full length HKNG1 cDNA and the alternatively spliced HKNG1-V1 cDNA are used in the synthesis of immature (i.e., uncleaved) HKNG1 protein. The results indicate that both initiator methionines are used at varying levels, resulting in the production of three different forms of the immature HKNG1 protein, referred to herein as immature protein form 1 (IPF1), immature protein form 2 (IPF2), and immature protein form 3 (IPF3).

[0583] Both the full length HKNG1 cDNA sequence shown in FIGS. 1A-C (SEQ ID NO:1) and the alternatively spliced HKNG1-V1 cDNA sequence shown in FIGS. 2A-C (SEQ ID NO:3) encode predicted proteins that have methionines in close proximity to their predicted initiator methionines. The predicted protein sequence encoded by the full length HKNG1 cDNA sequence has a second methionine at amino acid residue number 30 of the amino acid sequence depicted in FIGS. 1A-C (SEQ ID NO:2). Thus, although FIGS. 1A-C indicate that the full length HKNG1 cDNA encodes the first immature form of the HKNG1 protein depicted in FIGS. 1A-C (referred to herein as IPF1), the full length HKNG1 cDNA may additionally encode a second immature protein form (referred to herein as IPF2), whose sequence (SEQ ID NO:64) is provided on the third line of the protein alignment depicted in FIGS. 17A-17B. IPF2 is initiated at methionine 30 of the IPF1 protein sequence, and is identical to the RPP polypeptide sequence taught by Shimizu-Matsumoto et al. (1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585). Likewise, the alternatively spliced HKNG1-V1 cDNA sequence encodes the predicted immature protein form, referred to herein as IPF3, depicted in FIGS. 2A-C (SEQ ID NO:4). However, the HKNG1-V1 cDNA may also encode another immature protein form, identical to IPF2, that is initiated at methionine 12 of the IPF3 protein sequence. FIGS. 17A and 17B illustrate an alignment of the three immature HKNG1 protein sequences IPF3 (bottom row), IPF2 (third row), and IPF1 (second row). As explained in Section 13 above, the mature HKNG1 gene product secreted by cells expressing the HKNG1 constructs described in Section 11, above, is in fact the same cleaved product (SEQ ID NO:51), regardless of the immature HKNG1 protein (IPF1, IPF2, or IPF3) from which it is produced. An alignment of the mature secreted HKNG1 protein is, therefore, also depicted in FIGS. 17A-17B (top row).

[0584] Modified HKNG1:flag and HKNG1-V1:flag expression vectors were constructed as described in Sections 12.1 and 12.2, respectively. However, the nucleotide sequence of full length HKNG1 was modified, using standard site directed mutagenesis techniques, so as to introduce an additional base pair between the upstream methionine (i.e., met 1 in SEQ ID NO:2) and the downstream methionine (i.e., met 30 in SEQ ID NO:2). The nucleotide sequence of HKNG1-V1 was likewise modified, using standard site directed mutagenesis techniques, to introduce an additional base between its upstream methionine (i.e., met 1 in SEQ ID NO:4) and downstream methionine (i.e., met 12 in SEQ ID NO:4). Thus, in both modified constructs, the C-terminal flag epitope tag was no longer in the same reading frame as the upstream methionine but was in frame with the downstream methionine. Consequently, exclusive translation initiation at the first methionine of a construct would lead to the production of non-flag immunoreactive proteins. However, exclusive translation initiation at the second methionine of a construct would lead to the production of flag immunoreactive proteins.
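The reading-frame logic of the modified constructs can be illustrated with a toy sequence. The following Python sketch is a minimal illustration and does not use the actual HKNG1 sequences: the short open reading frame, the position and identity of the inserted base, and the particular codons chosen for the flag epitope (DYKDDDDK) are all hypothetical.

```python
# Toy demonstration of the single-base insertion strategy (hypothetical sequences).
FLAG = "GACTACAAGGACGACGATGACAAG"  # one possible coding sequence for DYKDDDDK


def in_frame(start, feature):
    """True if a feature begins a whole number of codons after the start codon."""
    return (feature - start) % 3 == 0


def tag_frames(seq):
    """Return (tag in frame with first ATG, tag in frame with second ATG)."""
    met1 = seq.find("ATG")
    met2 = seq.find("ATG", met1 + 3)
    tag = seq.find(FLAG)
    return in_frame(met1, tag), in_frame(met2, tag)


# Upstream ATG, two spacer codons, downstream ATG, one codon, flag tag, stop codon.
unmodified = "ATG" + "GCTGCT" + "ATG" + "AAA" + FLAG + "TGA"

# Insert one extra base between the two methionine codons.
modified = unmodified[:6] + "C" + unmodified[6:]

print("unmodified:", tag_frames(unmodified))
print("modified:  ", tag_frames(modified))
```

In the unmodified toy construct the tag is in frame with both start codons, whereas after the insertion it remains in frame only with the downstream start codon, so flag immunoreactivity reports initiation at the second methionine.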

[0585] Unmodified HKNG1:flag, unmodified HKNG1-V1:flag, modified HKNG1:flag, and modified HKNG1-V1:flag constructs were transfected into cells, and their resulting gene products were harvested, blotted onto a PVDF membrane, probed with an M2 anti-flag polyclonal antibody, and developed according to the methods described in Sections 12.1 and 12.2 above.

[0586] Flag immunoreactivity was detected in all four samples. The unmodified HKNG1:flag and HKNG1-V1:flag expression vectors produced amounts of mature secreted HKNG1:flag protein consistent with the levels detected in Sections 12.1 and 12.2 above. Further, the flag immunoreactive band detected for the modified HKNG1:flag construct was indistinguishable in intensity from the band detected for the unmodified HKNG1:flag construct, indicating that the immature HKNG1 protein produced by full length HKNG1 cDNA is predominantly IPF2, while IPF1 is produced by full length HKNG1 cDNA in relatively minor amounts.

[0587] The flag immunoreactive band from the modified HKNG1-V1:flag construct had dramatically reduced intensity relative to the band from the unmodified HKNG1-V1:flag construct. Thus, HKNG1-V1 produces primarily the immature HKNG1 protein IPF3, while the immature HKNG1 protein IPF2 is produced by HKNG1-V1 in relatively minor amounts. These results are summarized in Table 7, below.

TABLE 7
Construct    Immature Protein        Prominence
HKNG1        IPF1 (SEQ ID NO:2)      Minor
             IPF2 (SEQ ID NO:64)     Predominant
HKNG1-V1     IPF2 (SEQ ID NO:64)     Minor
             IPF3 (SEQ ID NO:4)      Predominant

[0588] Thus, the HKNG1 gene products of the invention include gene products corresponding to the immature protein forms IPF1 and IPF3. However, preferably the HKNG1 gene products of the invention do not include amino acid sequences consisting of the IPF2 sequence (SEQ ID NO:64).

16. IDENTIFICATION AND CHARACTERIZATION OF GNKH

[0589] The Example presented herein describes the identification and characterization of a novel gene referred to as GNKH. The genomic sequence of GNKH was found to overlap with portions of the genomic sequences of HKNG1 and a second gene, known as TS, that lies adjacent to HKNG1. In particular, the coding strand of the GNKH gene was found to lie on the strand opposite that of HKNG1 and TS. Thus, GNKH also has implications in the diagnosis and treatment of chromosome 18p-related processes and disorders such as neuropsychiatric disorders (e.g., BAD).

16.1. MATERIALS AND METHODS

[0590] A BLASTN (program version 1.4) search against the dbEST database (Boguski et al., 1993, Nature Genetics 4:332-333) was performed to identify ESTs with significant similarity (i.e., ESTs having p values equal to or less than 3×10⁻¹⁴) to HKNG1 cDNA or to its complementary sequence (i.e., to the complementary strand). ESTs identified by the BLASTN search were assembled “in silico” along with the HKNG1 cDNA sequence using the TIGR assembly package (see Sutton et al., 1995, Genome Sci. & Tech. 1:9-19), followed by DNAStar SeqMan (from DNAStar Inc., Madison, Wis.) and Sequencher programs (from Gene Codes Corp., Ann Arbor, Mich.) according to the manufacturer's instructions. After the BLASTN search, iterative rounds of BLASTN were performed to identify other sequences in the public databases with similarity to assembled contig sequences, followed by the assembly of the hits above a given threshold of similarity. The BLASTN search was implemented using the following parameters: threshold (E)=10; DNA word length, 11. The threshold of similarity for assembly was set such that hits must show at least 90% identity over a minimum of 50 bp.
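The assembly threshold stated above (at least 90% identity over a minimum of 50 bp) amounts to a simple filter on search hits. The following Python sketch is one way such a filter could be expressed; it is not the software used in the Example, and the `Hit` record, the function name and the example hits are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Minimal record for one database hit (hypothetical structure)."""
    accession: str       # placeholder identifiers, not real accessions
    identity_pct: float  # percent identity over the aligned region
    aligned_bp: int      # length of the aligned region in base pairs


def passes_assembly_threshold(hit, min_identity=90.0, min_length=50):
    """Apply the stated cut-off: at least 90% identity over at least 50 bp."""
    return hit.identity_pct >= min_identity and hit.aligned_bp >= min_length


# Made-up hits used only to exercise the filter.
hits = [
    Hit("EST_A", 97.5, 212),  # passes both criteria
    Hit("EST_B", 92.0, 41),   # aligned region too short
    Hit("EST_C", 88.0, 300),  # identity too low
]

kept = [h.accession for h in hits if passes_assembly_threshold(h)]
print(kept)
```

Hits passing the filter would then be handed to the assembly step, and the search would be repeated against the extended contig in each iterative round.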

[0591] To verify the existence of a gene encoded by the DNA fragment assembled by the IBLAST program, 5′ and 3′ RACE was performed by using Clontech Marathon Ready cDNA derived from brain, kidney and retina with the following primers, designed from the GNKH in silico contig:

5′ RACE Primers: P193 and AP1
P193    5′-ACGCCGCGGGCCCCTGCGGGACGGGT-3′ (SEQ ID NO:69)
AP1     5′-CCATCCTAATACGACTCACTATAGGGC-3′ (SEQ ID NO:70)

3′ RACE Primers: P195 and AP1
P195    5′-GGAGCCGCTGGGACGCGGCTTACCTC-3′ (SEQ ID NO:71)
AP1     5′-CCATCCTAATACGACTCACTATAGGGC-3′ (SEQ ID NO:72)

[0592] The EST clones from which the in silico contig was derived were also obtained. PCR was performed by using a Clontech Advantage-GC cDNA PCR Kit with 5 μL of the above-described cDNA. Briefly, the cycling parameters for the PCR reaction were as follows: the sample was incubated for 3 minutes at 95° C. followed by two repeats of a cycle wherein the sample was incubated for 30 seconds at 95° C., for 30 seconds at 72° C., and for one minute at 72° C. The annealing temperature was then lowered by 2° C. every two cycles until the temperature reached 62° C., followed by 25 repeats of a cycle wherein the sample was incubated at 95° C. for 30 seconds, at 55° C. for 30 seconds, and at 72° C. for one minute. Finally, the sample was incubated for 7 minutes at 72° C. and stored at 4° C. until gel purification. The DNA thus obtained was then gel purified from regions with bands and ligated into pGem T Easy. Positive clones were sequenced using standard dye-terminator chemistry.
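The touchdown cycling scheme described above can be written out step by step. The following Python sketch builds the program as a list of (step, temperature in ° C., seconds) entries. It is an illustration under one stated assumption, namely that two cycles are also run at the final 62° C. annealing temperature before the 25 cycles at 55° C.; the text does not state this explicitly, and the function names are hypothetical.

```python
def touchdown_anneal_temps(start=72, stop=62, step=2, cycles_per_step=2):
    """Annealing temperature for each touchdown cycle, including `stop` (an assumption)."""
    temps = []
    temp = start
    while temp >= stop:
        temps.extend([temp] * cycles_per_step)
        temp -= step
    return temps


def build_program():
    """Return the full program as (step name, temperature in deg C, seconds) tuples."""
    program = [("initial denaturation", 95, 180)]
    # Touchdown phase: annealing temperature drops 2 deg C every two cycles.
    for anneal in touchdown_anneal_temps():
        program += [("denature", 95, 30), ("anneal", anneal, 30), ("extend", 72, 60)]
    # Fixed-temperature phase: 25 cycles with annealing at 55 deg C.
    for _ in range(25):
        program += [("denature", 95, 30), ("anneal", 55, 30), ("extend", 72, 60)]
    program.append(("final extension", 72, 420))
    return program


program = build_program()
hold_minutes = sum(seconds for _, _, seconds in program) / 60
print(touchdown_anneal_temps())
print(len(program), "steps;", hold_minutes, "minutes of programmed hold time")
```

The total reported by the sketch counts programmed hold times only and ignores ramping between temperatures, so an actual run would take longer.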

[0593] The consensus sequence of the contig was mapped to the human chromosome 18p genomic sequence using the publicly available program EST2genome set to default parameters (see Mott R., 1997, Computer Applications in the Biosciences, 13(4):477-8).

[0594] BLASTX searching was also done using standard parameters to predict protein sequences that might be encoded by the novel gene.

[0595] Northern analysis was performed to identify tissues that express GNKH. Clontech human MTN blot IV and Clontech human brain blot II and IV were probed. The probe used in the Northern analysis was a gel-purified GNKH-specific PCR fragment generated from Clontech Marathon-ready brain cDNA using primers P193/P195 (see above). The probe fragment corresponds to nucleotides 438-679 of GNKH DNA sequence as depicted in FIG. 28. The probe was labeled with [α-³²P]dATP (6000 Ci/mmol) by random-priming using Promega's Prime-a-Gene Labeling System and following the manufacturer's instructions. The blots were prehybridized at 68° C. for 1 hr in 15 ml ExpressHyb solution (Clontech) in roller bottles. The probe was denatured by heating to 100° C. for 5 minutes and quickly chilling on ice. Hybridization was for 1.5 hr at 68° C. in 15 ml fresh ExpressHyb solution containing 1×10⁶ cpm/ml probe and 15 μg/ml sheared, denatured salmon sperm DNA. Blots were washed three times, each for 20 min. at 68° C. in 2×SSC, 0.05% SDS followed by two 20-min. washes at 68° C. in 0.1×SSC, 0.1% SDS. Filters were then wrapped in plastic wrap, exposed to a phosphor storage screen, and scanned on a Storm 860 Phosphorimager (Molecular Dynamics).

16.2. RESULTS

[0596] Iterative BLASTN searching of HKNG1 cDNA against the dbEST database identified a number of ESTs with similarity to HKNG1. These ESTs were assembled using the Gene Codes Sequencher program as described above. The assembly is depicted schematically in FIG. 24. Two contigs of interest were identified, which are depicted schematically in FIG. 25.

[0597] The first contig, referred to herein as Contig 1, comprised ESTs identified by the GenBank Accession Nos.: R61492, AA317281, AA639918, AI654367, H91726, H91647, G26658, C20640, R61493, H81803, AA361367, and was assembled using HKNG1 cDNA. The contig extends approximately 446 bases further downstream from the longest previously identified cDNA sequence.

[0598] Five of these ESTs (GenBank Accession Nos.: H91647, C20640, R61493, H81803 and AA361367) were found to extend downstream of both the published sequence of the rod photoreceptor protein (Shimizu-Matsumoto, A. et al., 1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585) and the original HKNG1 sequence described in Section 7, above. One of these ESTs, H81803, was ordered and sequenced. It was found to extend the HKNG1 sequence by a total of 565 bases downstream of the original sequence, before reaching a polyA tract. These additional 565 base pairs of sequence are shown in FIG. 26 (SEQ ID NO:73). All but the last 52 bases of this sequence are in good agreement with the HKNG1 genomic sequence, as depicted in FIGS. 3A-0-3A-28. The break in homology at the 3′ end of the gene may indicate an additional exon, although no sequence corresponding to this 52 bp was identified in the BAC sequence.

[0599] The second contig, referred to herein as Contig 2, does not assemble with HKNG1 cDNA. However, a BLASTN search revealed that this contig does have short stretches of identity with the previously published sequence of rod photoreceptor protein/HKNG1 (Shimizu-Matsumoto, A. et al., 1997, Invest. Ophthalmol. Vis. Sci. 38:2576-2585) and with a second gene, known as thymidylate synthase or TS (Hori et al., 1990, Hum. Genet. 85:576-580). Previous sequencing of the human chromosome 18p region has shown that exon 1 of TS lies approximately 6.5 kb downstream of the 3′ end of HKNG1 exon 11.

[0600] The contig formed by assembling these ESTs reveals a separate, novel gene which contains a short stretch of identity to both HKNG1 and TS. This novel gene is referred to herein as GNKH. Alignment of the GNKH sequence with the genomic sequence spanning HKNG1 and TS reveals that the coding strand for GNKH lies on the strand opposite that of HKNG1 and TS. When the ESTs comprising contig 2 were ordered and sequenced, additional 5′ sequence information was obtained, yielding the GNKH contig of 1161 bp depicted in FIG. 28 (SEQ ID NO:74). The first 424 bp of the GNKH sequence were not available in the dbEST database and were instead derived by complete sequencing of the following ESTs: AA993470, AA782906, AA629821, AI369817, AA554172, and AI361601. This portion of the GNKH sequence is complementary to a portion of the TS genomic sequence (GenBank Accession No. D00596). Specifically, the first 789 bp of the GNKH sequence are complementary to the sequence consisting of nucleic acid residues 1099-1881 of the TS genomic sequence. FIG. 27 schematically illustrates the positions of the above-described publicly available ESTs which align to the 1161 bp GNKH contig.

[0601] Two potential single nucleotide polymorphisms (SNPs), (C/T)207 and (C/G)566, were also identified in the sequenced GNKH contig.

[0602] Using the program EST2genome, the consensus sequence of the GNKH contig was aligned to a 68 kb stretch of chromosome 18 genomic sequence which includes HKNG1 exons 1-11, TS exon 1 and part of TS intron 1. FIG. 29 shows the schematic alignment of HKNG1/TS genomic DNA to GNKH cDNA and demonstrates that GNKH overlaps with both exonic and intronic sequences of the HKNG1/TS genomic DNA, with the dotted lines indicating the region of overlap with exonic sequence. In FIG. 29, GNKH is depicted in the 3′-5′ orientation to highlight its relationship to HKNG1 and TS, and AAAA signifies the presence of a polyA tail. FIGS. 30A and 30B show the detailed alignment of the GNKH reverse complement (RCGNKHEXP) to both exonic and intronic sequences of genomic HKNG1 and TS. This alignment reveals that the GNKH contig contains 2 putative exons interrupted by an 8 kb intron. The presence of canonical splice donor/acceptor sites at the 5′/3′ ends of the putative intron is consistent with this model. A consensus AAUAAA polyadenylation signal is found at bases 1109-1114 of GNKH; a number of clones were found to be polyadenylated at this site. A second polyadenylation signal is also observed at bases 895-900; some of the ESTs and RACE products were observed to possess a polyA tail immediately downstream of this site. These findings are all consistent with the hypothesis that GNKH represents a gene located on the opposite strand to HKNG1 and TS, and extending into the 25 kb BAD critical region described in Section 6, above.
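Locating a consensus polyadenylation signal of the kind described above is a simple pattern search. The following Python sketch reports the 1-based coordinates of each AAUAAA hexamer (written AATAAA in DNA) in a sequence. The GNKH sequence of FIG. 28 is not reproduced here, so the sketch runs on a made-up toy sequence; the function name is hypothetical.

```python
def find_polya_signals(seq, signal="AATAAA"):
    """Return 1-based (start, end) coordinates of every occurrence of the signal."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    hits = []
    pos = seq.find(signal)
    while pos != -1:
        hits.append((pos + 1, pos + len(signal)))
        pos = seq.find(signal, pos + 1)  # step by one so overlapping hits are found
    return hits


# Made-up toy sequence containing two signals and a short polyA tail.
toy = "GGCC" + "AATAAA" + "GCGCGCGCGC" + "AAUAAA" + "CC" + "AAAAAAAA"

print(find_polya_signals(toy))
```

Each reported interval spans six bases, matching the way the signals are given in the text (e.g., bases 1109-1114 and 895-900).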

[0603] Interestingly, one of the 6 genes lying in the original 340 kb critical region, rTS, is a naturally occurring antisense RNA which is known to have complementarity to the TS gene (Dolnick, Nuc. Acids Res. 21:1747-1752). FIG. 31 illustrates the relationship of the 4 genes encoding HKNG, TS, rTS and GNKH. Both rTS and GNKH lie on the opposite strand to HKNG1 and TS, and both overlap with the TS gene. Only GNKH extends into the critical 27 kb region described above in Section 6, which has been implicated in BAD.

[0604] As depicted in FIG. 31, the last exon of HKNG1, and the first and last exon of TS are represented as boxes, separated by intron sequence (solid line). GNKH and rTS are represented as boxes (exons) separated by spliced out introns (solid lines) with approximate intron sizes shown. Dashed lines represent the 13 kb of intervening genomic sequence which lies between GNKH and rTS. AAA represents predicted polyadenylation sites. Both rTS and GNKH lie on the opposite strand to HKNG1 and TS, and both overlap with the TS gene. Only GNKH extends into the critical 27 kb region, which has been implicated in BAD, and aligns to both exonic and intronic sequences of HKNG1 and TS genes.

[0605] A BLASTX search of the forward strand of the GNKH fragment against the protein database detected no significant homologies to known proteins. Predicted amino acid sequences were obtained for the two longest open reading frames (ORFs) found in the GNKH sequence, as depicted in FIGS. 32 and 33 (SEQ ID NOS: 75 and 76, respectively). These ORFs encoded peptides of 123 and 111 amino acids, respectively (SEQ ID NOS: , respectively). Searching of these 2 peptide sequences against the PROSITE (Hofmann et al., 1999, Nuc. Acids Res. 27:215-219; Bucher and Bairoch, 1994, Ismb 2:53-61.) and PFAM (Bateman et al., 1999, Nuc. Acids Res. 27:260-262) databases also failed to reveal any known patterns or motifs.

[0606] Northern blots identified a single GNKH transcript of 1.3 kb in all nervous tissue examined (cerebellum, cerebral cortex, medulla, spinal cord, occipital pole, frontal lobe, temporal lobe, putamen, amygdala, caudate nucleus, corpus callosum, hippocampus, whole brain, substantia nigra, and thalamus) and in non-neuronal thymus and small intestine. A larger transcript of 1.8 kb was identified by Northern blots in testis. Spleen, prostate, uterus, colon, and peripheral blood leukocytes did not express detectable levels of any GNKH transcript.

17. EXAMPLE Identification of GNKH Polymorphisms

[0607] This Example describes experiments performed, using genetic samples from BAD-affected and non-BAD-affected individuals, to identify mutations and/or polymorphisms of the GNKH transcript in those individuals. Several specific polymorphisms identified in the experiments are also described hereinbelow which may be used, e.g., in the diagnostic, prognostic and therapeutic methods of the present invention.

17.1. MATERIALS AND METHODS

[0608] Pairs of PCR primers that flank each GNKH exon (see Table 8) were made and used to PCR amplify genomic DNA isolated from BAD affected and normal individuals. The amplified PCR products were analyzed by DNA sequencing. The DNA sequences of the affected and controls were compared and variations were further analyzed.

TABLE 8
EXON      Sequence                                          Direction
Exon 1    5′-AACGGCTGCCTAACGTCCTGT-3′ (SEQ ID NO:77)        forward
          5′-GGAGAGCTGCCTGGGCTTGA-3′ (SEQ ID NO:78)         reverse
Exon 1    5′-TTGAAAACGCTGCGAAGCGGAAT-3′ (SEQ ID NO:79)      forward
          5′-CGCTACAGCCTGAGAGGTGA-3′ (SEQ ID NO:80)         reverse
Exon 1    5′-AGGATTGAGGTTAGGACTAAACG-3′ (SEQ ID NO:81)      forward
          5′-TGGCGCACGCTCTGTAGAGC-3′ (SEQ ID NO:82)         reverse
Exon 2    5′-CCATTCAACATAAGTAAACTAAGAG-3′ (SEQ ID NO:83)    forward
          5′-GCTTTTGTAGATGGGCTCTTAC-3′ (SEQ ID NO:84)       reverse

17.2. RESULTS

[0609] Exon scanning experiments were performed using genetic samples from both BAD-affected and non-affected individuals to identify polymorphisms and mutations that can be used, e.g., in the diagnosis and/or prognosis of patients that have or are susceptible to a bipolar affective disorder. Specifically, exon scanning was performed on the two exons of the GNKH gene using chromosomes isolated from three BAD-affected individuals and one normal individual from the Costa Rican population utilized for the LD studies discussed above in Section 6.

[0610] At least five variants in the GNKH transcript were identified. These variants are listed in Table 9, below, with respect to the GNKH sequence shown in FIG. 28 (SEQ ID NO:74). Column three of this table indicates the appropriate location of each polymorphism with respect to the opposite strand (i.e., the strand encoding HKNG1 and TS). Column one gives the actual location corresponding to the GNKH sequence as depicted in FIG. 28.

TABLE 9
Position (GNKH; FIG. 28,   Polymorphism                     Location (opposite strand)
SEQ ID NO:74)
200                        G->C                             TS intronic region (intron 1)
207                        T->C                             TS intronic region (intron 1)
566                        G->C                             TS intronic region (intron 1)
859                        poly A stretch: (A)n (n ≈ 15)    HKNG1 intronic region (intron 10)
993                        A->G                             HKNG1 intronic region (intron 10)

[0611] Each of the polymorphisms depicted in Table 9, above, may be used, e.g., in the methods and compositions of the present invention. In particular, the polymorphisms are useful, e.g., in further association studies to identify mutations and/or polymorphisms of the GNKH gene that are associated with bipolar affective disorder, and which, accordingly, can be used in the methods and compositions of the present invention for the diagnosis, prognosis and/or treatment of such disorders.

18. EXAMPLE Identifying Variations in HKNG1 Expression or Activity Which Correlate With BAD

[0612] This Section describes, in detail, exemplary and non-limiting methods which can be used to identify variations in HKNG1 among individuals, and to determine whether such variations correlate with a bipolar affective disorder. Specifically, the experiments described in this Section can be used to detect variations of the level of HKNG1 mRNA in cell samples from BAD-affected and control (i.e., non-BAD-affected) patients. For example, in one preferred embodiment, the cell samples are cell lines, for example lymphoblast cell lines, from BAD-affected and control individuals. In another embodiment, the samples may be tissue samples, such as brain tissue samples, from BAD-affected and control individuals. The skilled artisan readily appreciates, however, that any cell, cell line or tissue sample could be used in such methods.

[0613] Such variations can then be used, e.g., to diagnose BAD in individuals as well as to identify individuals predisposed to BAD, by detecting the presence or absence of the variation in a genetic sample obtained from an individual suspected of having or of being predisposed to a BAD condition. The therapeutic methods and compositions of the invention can also be used to treat individuals for BAD, e.g., by reversing or neutralizing the variance in HKNG1 in the individual.

[0614] In more detail, HKNG1 mRNA expression levels can be evaluated, according to the following methods, in samples, e.g., from cell lines obtained from patients suffering from BAD. For example, lymphoblast cells or other cells known to express HKNG1 can be isolated from patients suffering from BAD and cultured as a cell line. The HKNG1 mRNA expression levels in such cells can then be compared to HKNG1 mRNA expression levels in cells, preferably of the same type, isolated from patients not suffering from BAD (i.e., from non-affected individuals). Such “control” cell lines can be readily obtained, e.g., from the American Type Culture Collection (ATCC).

[0615] mRNA can be extracted from such cell lines and used, e.g., in Taqman PCR experiments, to determine the amount or level of HKNG1 expressed in cells, e.g., by amplifying and detecting the mRNA samples under a standard program on an ABI Prism 7700 Sequence Detection System (PE Applied Biosystems). Preferably, HKNG1 mRNA levels are compared to a suitable internal control, such as GAPDH (glyceraldehyde-3-phosphate dehydrogenase), whose mRNA levels are measured in the same cell lines. mRNA levels measured from such an internal control can then serve to normalize the HKNG1 mRNA levels measured for the different cell lines. Exemplary primer sequences that can be used in the PCR amplification of both HKNG1 and GAPDH are provided below in Tables 10 and 11, respectively.

TABLE 10 (SEQ ID NOS:85-87)
HKNG1     Conc.     Nucleotide Sequence
Primers   200 nM    GGAACACACCAATCTAATGAGCAC (forward)
          200 nM    GTTGGCAGGTTGTATAAATTCTCATGCAG (reverse)
Probe     100 nM    6FAM-AGGCTATGCCGGGAGTCTTTGGCAGATTCC

[0616] TABLE 11 (SEQ ID NOS:88-90)
GAPDH     Conc.     Nucleotide Sequence
Primers    80 nM    GAAGGTGAAGGTCGGAGTC (forward)
           80 nM    GAAGATGGTGATGGGATTTC (reverse)
Probe     100 nM    JOE-CAAGCTTCCCGTTCTCAGCC
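The normalization to an internal control described above can be sketched as follows. This is only an illustration under stated assumptions: it supposes that the instrument reports a threshold cycle (Ct) for each target and that the comparative Ct calculation (2 to the power of minus delta-delta Ct, with perfect doubling per cycle) is applied. Neither assumption is stated in the text, and the function names and Ct values are hypothetical.

```python
def delta_ct(ct_hkng1, ct_gapdh):
    """HKNG1 Ct minus the GAPDH Ct measured in the same cell line."""
    return ct_hkng1 - ct_gapdh


def relative_expression(sample, reference):
    """Fold difference of HKNG1 in `sample` versus `reference`.

    Each argument is a (HKNG1 Ct, GAPDH Ct) pair. Assumes the amount of
    product doubles every cycle, which is an idealization.
    """
    ddct = delta_ct(*sample) - delta_ct(*reference)
    return 2.0 ** (-ddct)


# Hypothetical (HKNG1 Ct, GAPDH Ct) pairs, not measured data.
control_line = (27.0, 18.0)
patient_line = (28.0, 18.0)
fold = relative_expression(patient_line, control_line)
print(f"HKNG1 level relative to control: {fold:.2f}")
```

Dividing out the GAPDH signal in this way means that differences in the amount of input mRNA between cell lines do not masquerade as differences in HKNG1 expression.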

[0617] Routine techniques of statistical analysis can be readily used by those skilled in the art to determine whether variations of HKNG1 mRNA levels correlate with BAD. Preferably, any correlations identified by such techniques are subsequently verified, e.g., using larger, and therefore statistically more robust, samples. Differences in HKNG1 mRNA expression levels that are thus identified and confirmed to correlate with BAD can then be used in both the diagnostic and prognostic evaluation of patients who are suspected of suffering from a BAD or are suspected of being predisposed to a BAD. For example, mRNA levels of HKNG1 can be measured from cell lines obtained from a patient and compared to HKNG1 mRNA levels both in cell lines obtained from normal individuals not suffering from or predisposed to BAD, and in cell lines obtained from individuals who are suffering from or predisposed to BAD.
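One routine technique of the kind referred to above is an exact permutation test on normalized expression levels. The sketch below is an illustration only: the text does not name a particular test, the choice of a permutation test is an assumption, and the expression values are hypothetical placeholders rather than measured data.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    hits = total = 0
    # Enumerate every way of relabeling n_a of the pooled samples as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical normalized HKNG1 levels (arbitrary units), not measured data.
affected = [0.4, 0.5, 0.6, 0.5]
controls = [1.0, 1.1, 0.9, 1.2]
p_value = exact_permutation_p(affected, controls)
print(f"p = {p_value:.4f}")
```

With only four samples per group the smallest attainable two-sided p-value is 2/70, which illustrates why the text recommends confirming any correlation in larger samples.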

[0618] Variations in HKNG1 expression can also be exploited in the methods of the invention to treat BAD by reversing and/or neutralizing the variation in a patient, e.g., using the methods described above in Section 5.7, e.g., to either reduce or increase levels of HKNG1 mRNA expressed in a patient or in an appropriate cell population or subpopulation of the patient.

19. EXAMPLE Identification of Rat HKNG1

[0619] The Example presented in this Section describes the isolation and identification of a rat homolog of human HKNG1 and its predicted amino acid sequence.

19.1. MATERIALS AND METHODS

[0620] Reverse Transcription of Rat Retina mRNA:

[0621] Rat retina mRNA (Clontech) was used to clone a partial rat HKNG1 cDNA spanning the entire coding sequence of the rat HKNG1 gene. Specifically, 2 μg rat retina mRNA was reverse transcribed with Life Technologies Superscript II reverse transcriptase according to the manufacturer's instructions. 0.5 M NaOH was added to the reverse transcription reaction product to a final concentration of 150 mM, and the mixture was boiled for five minutes, followed by addition of an equal volume of 0.5 M HCl and dilution to 200 μL with TE buffer (pH 8.0).
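The volumes implied by this step follow from a simple dilution calculation, sketched below. The starting reaction volume of 20 μL is an assumption made for illustration, since the text defers to the manufacturer's instructions and does not state it; the function name is likewise hypothetical.

```python
def stock_volume_for_final_conc(start_volume_ul, stock_molar, final_molar):
    """Volume of stock to add so that the mixture reaches `final_molar`.

    Solves final = stock * v / (start + v) for v, assuming the starting
    solution contains none of the solute.
    """
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return start_volume_ul * final_molar / (stock_molar - final_molar)


rt_volume_ul = 20.0  # assumed reaction volume; not stated in the text
naoh_ul = stock_volume_for_final_conc(rt_volume_ul, stock_molar=0.5, final_molar=0.150)
hcl_ul = naoh_ul  # an equal volume of 0.5 M HCl neutralizes the 0.5 M NaOH
te_ul = 200.0 - (rt_volume_ul + naoh_ul + hcl_ul)  # TE needed to reach 200 uL

print(f"0.5 M NaOH: {naoh_ul:.1f} uL, 0.5 M HCl: {hcl_ul:.1f} uL, TE: {te_ul:.1f} uL")
```

Because the acid and base stocks have the same molarity, adding equal volumes returns the sample to neutrality before the dilution in TE.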

[0622] MOPAC Cloning of a Partial rat HKNG1 cDNA Fragment:

[0623] An aliquot of the reverse transcribed rat retina mRNA, described above, was used to clone a partial fragment of rat HKNG1 cDNA by adopting the Multiple Oligo Primed Amplification of cDNAs or “MOPAC” technique described, e.g., by Lee et al., 1988, Science 239:1288-1291. In particular, MOPAC fragments were amplified from the resulting cDNA in primary and secondary PCR reactions using the primers listed in Table 13, below.

TABLE 13 (SEQ ID NOS:91-96)
Reaction    Primer Name    Primer Sequence
Primary     HK9/10(1)      5′ CTG(AG)TGGAGAAGATGAGAG(AG)GCA
            HK9/10(−1A)    3′ TTTAAA(AG)TG(CT)TCCTTAAAATGCTG
            HK9/10(−1B)    3′ TTTAAA(AG)TG(CT)TCCTTAAAGTGCTG
Secondary   HK9/10(2A)     5′ GATGAGAG(AG)GCA(AG)TTTGGCTGGGT
            HK9/10(2B)     5′ GATGAGAG(AG)GCA(AG)TTTGGTTGGGT
            HK9/10(−2)     3′ GAGTGTGAA(AG)TTAGAGGAAGGCAG

[0624] Specifically, the primary PCR reaction was carried out by pooling 20 μl of the cDNA product (i.e., one-tenth of the 200 μl reverse transcription product) in a total of 100 μl of 1.1× Taq buffer (Perkin Elmer), 200 μM dNTPs, 5 units AmpliTaq Gold polymerase and 0.55 μM sense primary primer HK9/10(1) in Table 13. The 100 μl was divided into two 45 μl aliquots, and 5 μL of antisense primary primers HK9/10(−1A) and HK9/10(−1B), shown in Table 13, above, were added to the first and second aliquot, respectively, each at a final concentration of 0.5 mM. Each 50 μl aliquot was further divided into five 10 μL aliquots and transferred to thin wall PCR tubes. The aliquots were each heated to 95° C. for 10 minutes to activate the AmpliTaq polymerase, and cycled at five separate annealing temperatures through the following PCR cycle: (95° C. for 30 seconds, incubation at one of the five annealing temperatures for 30 seconds, and 75° C. for 20 seconds) × 29, using annealing temperatures of 52.5°, 55°, 57.5°, 60°, and 62.5° C., respectively, for each of the five aliquots.
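The 1.1× buffer and 0.55 μM sense primer concentrations make sense once the later addition of antisense primer is taken into account: topping up each 45 μl aliquot with 5 μl brings both components to approximately their working concentrations. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
def diluted(conc, aliquot_ul, added_ul):
    """Concentration after `added_ul` of diluent is added to `aliquot_ul`."""
    return conc * aliquot_ul / (aliquot_ul + added_ul)


# 45 ul of master mix plus 5 ul of antisense primer gives a 50 ul reaction.
buffer_x = diluted(1.1, 45.0, 5.0)    # Taq buffer strength, in multiples of 1x
sense_um = diluted(0.55, 45.0, 5.0)   # sense primer HK9/10(1), in micromolar
tubes_per_aliquot = 50 // 10          # each 50 ul aliquot is split into 10 ul tubes

print(f"buffer: {buffer_x:.2f}x, sense primer: {sense_um:.3f} uM, tubes: {tubes_per_aliquot}")
```

The five tubes per aliquot correspond to the five annealing temperatures listed in the cycling conditions.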

[0625] Twenty secondary PCR reactions were carried out in 100 μL volumes. Reaction conditions were as described above except that 1 μL of each primary reaction was used as template and the 3′ and 5′ secondary primers listed in Table 13, above, were utilized. Specifically, all of the secondary reaction mixtures used the 3′ secondary primer HK9/10(−2) shown in Table 13. Half of the secondary reaction mixes used the 5′ secondary A primer HK9/10(2A), while the other half used the 5′ secondary B primer, i.e., HK9/10(2B). Thus, primary and secondary PCR reactions were carried out for four different combinations of the A and B primers, as shown below in Table 14. The secondary PCR reaction was run using the same cycle and temperatures as described above for the primary PCR reaction.

TABLE 14
Reaction    Primer   AA            AB            BA            BB
Primary     5′       HK9/10(1)     HK9/10(1)     HK9/10(1)     HK9/10(1)
            3′       HK9/10(−1A)   HK9/10(−1A)   HK9/10(−1B)   HK9/10(−1B)
Secondary   5′       HK9/10(2A)    HK9/10(2B)    HK9/10(2A)    HK9/10(2B)
            3′       HK9/10(−2)    HK9/10(−2)    HK9/10(−2)    HK9/10(−2)
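The count of twenty secondary reactions can be reproduced by enumerating the primer combinations against the annealing temperatures. The sketch below assumes that each secondary reaction was cycled at the same annealing temperature as the primary reaction that supplied its template; the text says only that the same cycle and temperatures were used. Primer names use an ASCII hyphen in place of the minus sign.

```python
from itertools import product

# First letter: primary 3' primer; second letter: secondary 5' primer.
primary_antisense = {"A": "HK9/10(-1A)", "B": "HK9/10(-1B)"}
secondary_sense = {"A": "HK9/10(2A)", "B": "HK9/10(2B)"}
annealing_c = [52.5, 55.0, 57.5, 60.0, 62.5]

reactions = [
    {
        "combo": p + s,
        "primary_3prime": primary_antisense[p],
        "secondary_5prime": secondary_sense[s],
        "anneal_c": t,
    }
    for p, s in product("AB", repeat=2)
    for t in annealing_c
]
combos = sorted({r["combo"] for r in reactions})

print(combos, len(reactions))
```

Four primer combinations at five annealing temperatures account for the twenty reactions.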

[0626] The final PCR products were subcloned into pCR II Topo using the Topo TA cloning kit from Invitrogen, and the resulting colonies were picked into 2 ml cultures. 1.5 ml of each culture was used in a Qiagen Tip 20 purification kit and the purified cDNA was sequenced with ³³P using the Sequenase kit from Amersham.

[0627] 3′ RACE Cloning of a rat HKNG1 cDNA Fragment:

[0628] A cDNA fragment of the rat HKNG1 gene was isolated from rat retinal mRNA using the 3′ RACE protocol of Frohman et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:8998-9002. Specifically, 2 μg of rat retinal mRNA (Clontech) was reverse transcribed using Life Technologies Superscript II reverse transcriptase according to the manufacturer's directions. The following 3′ oligonucleotide was used as a primer: 5′-CACACCAGTAGACCCACACAGCCACCATCGATGCGGCCGCGGATCCATTTTTTTTTTTTTTTTTTT-3′ (SEQ ID NO:97).

[0629] The reaction was terminated by adding 0.5 M NaOH to a final concentration of 150 mM and boiling for 5 minutes, followed by neutralization by adding the same volume of 0.5 M HCl and dilution to 200 μL by the addition of TE.

[0630] The resulting single stranded cDNA product was then amplified by polymerase chain reaction (PCR) using primers derived from the first rat HKNG1 partial cDNA isolated in the MOPAC experiments described above. Specifically, the following primers were used:

Reaction    Primer Name   Primer Sequence
Primary     rHK-WVSQ      5′-TGGGTGTCTCAACTGGCAAGCCAT-3′
            RACE-1°       5′-CACACCAGTAGACCCACACAGCCA-3′
Secondary   rHK-HNPV      5′-CATAACCCAGTGACTGAGGACATC-3′
            RACE-2°       5′-ACCATCGATGCGGCCGCGGATCCA-3′

[0631] (SEQ ID NOS:98-101)
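The two RACE primers listed above are nested within the adapter portion of the 3′ oligonucleotide used for reverse transcription, which is what allows the secondary reaction to re-amplify only products of the primary reaction. The short check below illustrates this using the sequences exactly as printed; the oligonucleotide is written in the same three pieces in which it appears in the text, and the variable names are hypothetical.

```python
# 3' oligonucleotide used for reverse transcription (SEQ ID NO:97),
# concatenated from the three pieces printed in the text.
anchor_oligo = (
    "CACACCAGTAGACCCACACAGCCACCATCGA"
    "TGCGGCCGCGGATCCATTTTTTTTTTTTTTTTTT"
    "T"
)
race_1 = "CACACCAGTAGACCCACACAGCCA"  # RACE-1 primer (primary reaction)
race_2 = "ACCATCGATGCGGCCGCGGATCCA"  # RACE-2 primer (secondary reaction)

adapter = anchor_oligo.rstrip("T")   # adapter without the oligo(dT) tail
pos_1 = anchor_oligo.find(race_1)    # 0-based start of RACE-1 in the oligo
pos_2 = anchor_oligo.find(race_2)    # 0-based start of RACE-2 in the oligo

print(f"adapter length: {len(adapter)} nt")
print(f"RACE-1 starts at {pos_1}, RACE-2 starts at {pos_2}")
```

RACE-1° matches the 5′ end of the adapter and RACE-2° matches the remainder, immediately upstream of the oligo(dT) tail, with a single shared base between them.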

[0632] One tenth of the cDNA was added to a 100 μL reaction sample containing: 5 units of Amplitaq Gold (Perkin Elmer); 0.5 μM of the primer rHK-WVSQ; 0.5 μM of the primer RACE-1°; 1× Taq Buffer (Perkin Elmer); and 200 μM dNTPs (Pharmacia). Four 22 μL aliquots were taken from this reaction sample and each aliquot was PCR cycled at annealing temperatures of 57.5° C., 60° C., 62.5° C. and 65° C., respectively, according to the following protocol:

[0633] (i) incubate at 95° C. for 10 minutes (to activate the Amplitaq polymerase);

[0634] (ii) incubate at 96° C. for 30 seconds;

[0635] (iii) incubate at the indicated annealing temperature for 30 seconds;

[0636] (iv) incubate at 75° C. for one minute; and

[0637] (v) repeat steps (ii)-(iv) 29 additional times.

[0638] A 100 μL secondary PCR reaction mixture was prepared containing: 5 units Amplitaq Gold; 0.5 μM of the primer rHK-HNPV; 0.5 μM of the primer RACE-2°; 1× Taq Buffer (Perkin Elmer); and 200 μM dNTPs (Pharmacia). Four 24 μL aliquots of the secondary PCR reaction mixture were transferred into separate test tubes, and 1 μL of each primary PCR reaction product was added to each tube. Specifically, 1 μL of the primary PCR reaction product prepared by annealing at 57.5° C. was added to one test tube, 1 μL of the primary PCR reaction product prepared by annealing at 60° C. was added to another test tube, and so forth. Each of these secondary reaction mixtures was then PCR cycled at 57.5° C., 60° C., 62.5° C. and 65° C., respectively, according to the above-described cycling protocol.

[0639] 20 μL of each PCR reaction was electrophoresed in a 1% (weight/volume) low melt agarose gel (Sea Plaque, FMC) and an intense band of approximately 300 base pairs in length was observed from the reactions at all four temperatures. The band was excised from the gel, melted at 70° C. and then cooled to 37° C. The cooled but still molten gel was used as a template with a TOPO cloning kit (Invitrogen) to subclone the PCR product into pCR II according to the manufacturer's directions. Six white colonies resulting from the transformation of the TOPO reaction were picked into BHI media and plasmid DNA was isolated by miniprepping (Qiagen Tip 20). DNA from each of these six colonies was manually sequenced (Sequenase 2.0, Amersham) using M13 forward and M13 reverse primers according to the manufacturer's directions.

[0640] MOPAC Cloning of a Second Partial rat HKNG1 cDNA:

[0641] A second rat HKNG1 partial cDNA was also cloned using the Multiple Oligo Primed Amplification of cDNAs (MOPAC), described above. This second MOPAC experiment used an antisense rat HKNG1 primer derived from the partial cDNA sequence obtained in the first MOPAC experiment to obtain a rat HKNG1 cDNA, described below in Section 19.2, that included all but the 5′ untranslated region and the coding region for the amino-terminus of the rat HKNG1 gene product.

[0642] Specifically, the following four degenerate sense primers were synthesized based on coding sequences for the amino-terminus of the human, bovine and guinea pig HKNG1 gene products (SEQ ID NOS:102-105):

Primer Name   Primer Sequence
HK 5′conA     5′-CA(GATC)TG(CT)GG(AG)CC(TC)ACAGGGAAGGA-3′
HK 5′conB     5′-CA(GATC)TG(CT)GG(AG)CC(TC)ACATGGAAGGA-3′
HK 5′conC     5′-CA(GATC)TG(CT)GG(AG)CC(TC)ACTTGGAAGGA-3′
HK 5′conD     5′-CA(GATC)TG(CT)GC(AG)CC(TC)ACTGGGAAGGA-3′

[0643] Nucleotides in parentheses indicate degenerate sequences. For example, (GATC) indicates that 25% of the primers had a guanine at the indicated position, 25% of the primers had an adenine at the indicated position, 25% of the primers had a thymine at the indicated position, and 25% of the primers had a cytosine at the indicated position. (AG) indicates that 50% of the primers had an adenine at the indicated position and 50% had a guanine at the indicated position.
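The parenthesis notation can be made concrete by expanding one degenerate primer into the individual sequences it represents. The following sketch uses primer HK 5′conA as printed above; the function name is hypothetical, and the calculation of each variant's share of the pool assumes the bases at every degenerate position are present in equal proportions, as the percentages above imply.

```python
import re
from itertools import product


def expand_degenerate(primer):
    """List every sequence encoded by a primer with (XY...) degenerate positions."""
    # Each match is either a parenthesized set of bases or a run of fixed bases.
    parts = re.findall(r"\(([ACGT]+)\)|([ACGT]+)", primer)
    choices = [list(group) if group else [fixed] for group, fixed in parts]
    return ["".join(combo) for combo in product(*choices)]


hk_5con_a = "CA(GATC)TG(CT)GG(AG)CC(TC)ACAGGGAAGGA"
variants = expand_degenerate(hk_5con_a)

print(len(variants), "distinct sequences, each", len(variants[0]), "nt long")
print("share of pool per sequence:", 1 / len(variants))
```

One four-fold and three two-fold degenerate positions give 4 × 2 × 2 × 2 = 32 distinct 23-mers in each primer pool.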

[0644] An antisense rat HKNG1 primer was derived from the first partial rat HKNG1 cDNA sequence obtained in the first MOPAC experiment described above, and had the following name and sequence:

Primer Name   Primer Sequence
rHK AS HGGD   5′-CTGCTTGGAAGAATCTCCTCCATG-3′ (SEQ ID NO:106)

[0645] Four 100 μL PCR reactions were prepared, each containing: 1/20th of the rat retina cDNA reaction product; 5 units Amplitaq Gold; 0.5 μM of one of the HK 5′con degenerate primers; 0.5 μM of the rHK AS HGGD primer; and 200 μM dNTPs (Pharmacia). In particular, the four PCR reactions contained 0.5 μM of the primer HK 5′conA, HK 5′conB, HK 5′conC and HK 5′conD, respectively. Each of these four 100 μL PCR reactions was divided into four 22 μL aliquots, and each aliquot was PCR cycled at annealing temperatures of 57.5° C., 60° C., 62.5° C. and 65° C., respectively, according to the following protocol:

[0646] (i) incubate at 95° C. for 10 minutes (to activate the Amplitaq polymerase);

[0647] (ii) incubate at 96° C. for 30 seconds;

[0648] (iii) incubate at the indicated annealing temperature (i.e., at 57.5° C., 60° C., 62.5° C. or 65° C.) for 30 seconds;

[0649] (iv) incubate at 75° C. for two minutes; and

[0650] (v) repeat steps (ii)-(iv) 29 additional times.

[0651] Thus, a PCR aliquot for each of the four sense primers described above was PCR cycled at each of the four above-listed annealing temperatures, for a total of sixteen separate PCR reactions.

[0652] 20 μL from each PCR reaction was electrophoresed in a 0.4% (weight/volume) low melt agarose gel (Sea Plaque, FMC). An intense band of the expected size (i.e., of about 1.2 kb) was observed in the reaction products prepared from all four PCR annealing temperatures, and was most prominent for the reactions with the third degenerate primer (i.e., the primer designated HK 5′conC). The bands were excised, melted at 70° C. and allowed to cool to 37° C. The cooled but still molten gel was used as a template with an Invitrogen TOPO cloning kit to subclone the PCR product into pCR II. Six white colonies resulting from the transformation of the TOPO reaction were picked into BHI media and the plasmid DNA was isolated by miniprepping (Qiagen Tip 100). DNA from each of these six colonies was manually partially sequenced (Sequenase 2.0, Amersham) using M13 forward and M13 reverse primers. An initial read confirmed that this partial cDNA corresponded to a full length HKNG1 sequence, and the cDNA was sequenced in its entirety according to routine, automated sequencing methods.

[0653] PCR Amplification of Full Length rat HKNG1 cDNA:

[0654] The full length coding cDNA of rat HKNG1 was isolated by PCR using primers derived from a published EST sequence discussed below. Specifically, a forward primer, designated rHK 5′UTR1, was designed from a published EST sequence which overlapped with the 5′-end of the partial cDNA sequence isolated in the second MOPAC experiment, described hereinabove. A reverse PCR primer, designated rHK 3′UTR1, was designed from the complementary sequence of the 3′-UTR rat HKNG1 cDNA sequence obtained by the above described 3′ RACE experiments. The primer sequences are provided below (SEQ ID NOS:107-108):

Primer Name             Primer Sequence
rHK 5′UTR1 (forward)    5′-TGTAAAACGACGGCCAGTGCGGCACGAGGCACATCGTAAAAAGTG-3′
rHK 3′UTR1 (reverse)    5′-CAGGAAACAGCTATGACCCCTACCCTCTCAACAAAGCTTTCC-3′

[0655] Five 100 μL reaction samples were prepared, each containing: 1/20th of the above described rat retina cDNA reaction; 1.0 μM of the rHK 5′UTR1 primer; 1.0 μM of the rHK 3′UTR1 primer; 1× ExTaq buffer (Takara Biomedicals); and 200 μM dNTPs (Pharmacia). Each of the five reaction samples was incubated at 95° C. for 5 minutes, after which they were “hot-started” by adding five units of ExTaq DNA polymerase to each reaction sample. Each of the five reaction samples was then cycled 30 times according to the following PCR cycling protocol: (i) incubating at 95° C. for 30 seconds; (ii) incubating for 30 seconds at an annealing temperature of 65° C.; and (iii) incubating at 75° C. for 2 minutes.

[0656] After completing the PCR cycles, the five reaction samples were pooled, ethanol precipitated and electrophoresed on a 0.4% (weight/volume) preparative low melt agarose gel (SeaPlaque, FMC). A gel slice harboring a prominent PCR product approximately 1.6 kb in length was excised from the gel, melted at 70° C., diluted up to 0.5 mL and subjected to digestion with β-agarase (New England Biolabs). After digestion, the sample was phenol extracted twice, chloroform extracted twice, and ethanol precipitated. The resulting purified PCR product was sequenced using standard automated sequencing techniques.

19.2. RESULTS

[0657] A rat homolog of the human HKNG1 gene was cloned and sequenced from rat retina mRNA in four separate steps. First, a partial cDNA fragment, corresponding to a region near the 3′-end of the coding region for a rat HKNG1 gene product, was isolated according to the above described MOPAC experiment. The cDNA sequence of this fragment is depicted in FIG. 34 (SEQ ID NO:109). FIG. 34 (SEQ ID NO:110) shows the predicted amino acid sequence encoded by this fragment. This amino acid sequence was aligned to the amino acid sequences of the human, bovine and guinea pig HKNG1 gene products provided herein, as shown in FIG. 35, confirming that the isolated rat gene product depicted in FIG. 34 (SEQ ID NO:110) is homologous but not identical to the previously isolated HKNG1 gene products. Thus, the cDNA sequence depicted in FIG. 34 (SEQ ID NO:109) is likely to be a rat HKNG1 ortholog.

[0658] Next, a second partial cDNA was isolated by 3′ RACE, as described above in Section 19.1. This second fragment included sequence encoding the carboxy-terminus of the rat HKNG1 gene product as well as portions of the 3′-untranslated region (i.e., non-coding sequence) of a full length rat HKNG1 cDNA. The sequence of this second cDNA fragment is shown in FIG. 36A (SEQ ID NO:111), whereas FIG. 36B (SEQ ID NO:112) shows the predicted amino acid sequence encoded by the cDNA fragment. This predicted amino acid sequence was confirmed to be the carboxy-terminal sequence of a rat HKNG1 gene product by visually aligning and comparing it to the human, bovine, and guinea pig HKNG1 gene product sequences disclosed herein.

[0659] Using (a) degenerate sense primers designed from highly conserved amino-terminal sequences of the human, guinea pig and bovine HKNG1 genes disclosed above, and (b) an antisense primer derived from the first rat HKNG1 cDNA fragment shown in FIG. 34 (SEQ ID NO:109), a third, larger rat HKNG1 cDNA fragment was isolated and cloned in another MOPAC experiment, described in Section 19.1, above. The sequence of this third cDNA fragment is depicted in FIG. 37A (SEQ ID NO:113). FIG. 37B (SEQ ID NO:114) shows the predicted amino acid sequence encoded by this cDNA fragment.

[0660] A published rat EST sequence (GenBank Accession No. AI715798) was identified that overlapped substantially with the rat HKNG sequence shown in FIGS. 37A-B (SEQ ID NOS:113-114). Specifically, the EST sequence AI715798 is a known EST whose sequence is shown in FIG. 38A (SEQ ID NO:115). The EST's complementary sequence is shown in FIG. 38B (SEQ ID NO:116) and is predicted to encode the amino acid sequence:

[0661] RHEAHRKK*RSFQKLVAISLGRAAISVEHWTMQPPLFVISVYLLWLKYCDSAPTWKETDATDGNLKSLPEVGEADVEGEVKKALIGIKQMKIMMERREEEHAKLMKALKKKKK (also shown in FIG. 38C; SEQ ID NO:117). The asterisk indicates a STOP codon appearing in the reading frame of the EST sequence.

[0662] This predicted amino acid sequence overlaps substantially with the rat HKNG1 amino acid sequence depicted in FIG. 37B, as indicated by the amino acid residues depicted in underlined, italicized type above; i.e., the polypeptide sequence:

[0663] TDATDGNLKSLPEVGEADVEGEVKKALIGIKQMKIMMERREEEHAKLMKALKKKKK (SEQ ID NO:118) corresponds to both the amino-terminal sequence of SEQ ID NO:117 shown above and in FIG. 38C, and the carboxy-terminal sequence of SEQ ID NO:114 shown in FIG. 37B. It was concluded, therefore, that the complement of the EST AI715798 is also a partial rat HKNG1 cDNA sequence. New PCR primers were therefore designed using predicted 5′ UTR sequence from this EST sequence and the 3′ untranslated rat HKNG1 cDNA sequence generated by the above-described 3′ RACE experiments, and used to isolate a cDNA encoding a full length rat HKNG1 gene product as described in Section 19.1 above. The sequence of this rat HKNG1 cDNA is shown in FIG. 39A (SEQ ID NO:119), and the predicted amino acid sequence of the full length rat HKNG1 gene product that it encodes is shown in FIGS. 39B-1 and 39B-2 (SEQ ID NO:120).
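The overlap between the EST translation and the shared polypeptide segment can be checked with a plain string comparison of the two sequences as printed above. The sketch below is illustrative only; it locates the segment within the printed EST translation and reports the position of the in-frame STOP codon, and the variable names are hypothetical.

```python
# Predicted translation of the EST complement (SEQ ID NO:117), as printed.
est_translation = (
    "RHEAHRKK*RSFQKLVAISLGRAAISVEHWTMQPPLFVISVYLLWLKYCDSAPTWKE"
    "TDATDGNLKSLPEVGEADVEGEVKKALIGIKQMKIMMERREEEHAKLMKALKKKKK"
)
# Segment shared with the rat HKNG1 sequence of FIG. 37B (SEQ ID NO:118).
shared_segment = "TDATDGNLKSLPEVGEADVEGEVKKALIGIKQMKIMMERREEEHAKLMKALKKKKK"

start = est_translation.find(shared_segment)  # 0-based index, -1 if absent
stop_index = est_translation.find("*")        # position of the STOP codon

print(f"shared segment of {len(shared_segment)} residues found at index {start}")
print(f"in-frame STOP codon at index {stop_index}")
```

A full comparison against SEQ ID NO:114 would need the sequence from FIG. 37B, which is not reproduced in this section.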

[0664] The isolation of the original rat HKNG full length clones described above also led to the identification of two naturally occurring rat HKNG full length clone variants which were isolated from Sprague-Dawley rats. The first of the naturally occurring rat HKNG full length clone variants, which is referred to herein as rHKNG1I, contained a single nucleotide substitution. In this embodiment of the rat HKNG full length variant clone, the nucleotide at position 816 is a thymine (T) (SEQ ID NO:134). The cDNA sequence of this rat HKNG full length clone variant is depicted in FIG. 40A (SEQ ID NO:134). In this embodiment, the amino acid at position 235 is isoleucine (I) (SEQ ID NO:135). FIGS. 40B-1 and 40B-2 (SEQ ID NO:135) show the predicted amino acid sequence encoded by this rat HKNG full length clone variant. The second of the naturally occurring rat HKNG full length clone variants, which is referred to herein as rHKNG1T, also contained a single nucleotide substitution. In this embodiment of a nucleotide sequence of the rat HKNG full length clone variant, the nucleotide at position 816 is a cytosine (C) (SEQ ID NO:136). The cDNA sequence of this rat HKNG full length clone variant is depicted in FIG. 41A (SEQ ID NO:136). In this embodiment, the amino acid at position 235 is threonine (T) (SEQ ID NO:137). FIGS. 41B-1 and 41B-2 (SEQ ID NO:137) show the predicted amino acid sequence encoded by this rat HKNG full length clone variant. Each of the variants was confirmed by direct sequencing of RT-PCR products from the rat retina polyA RNA used to obtain the clones and by sequencing PCR products derived from amplification of Sprague-Dawley rat genomic DNA.
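The relationship between the T/C substitution and the isoleucine/threonine difference follows from the standard genetic code: every isoleucine codon has T at its second position and every threonine codon has C there. The sketch below illustrates this with a partial codon table. The example codon is hypothetical, since the actual codon 235 appears only in FIGS. 40A and 41A, and the placement of nucleotide 816 at the second codon position is an inference from the genetic code rather than a statement in the text.

```python
# Partial standard genetic code: isoleucine and threonine codons only.
CODON_TABLE = {
    "ATT": "I", "ATC": "I", "ATA": "I",
    "ACT": "T", "ACC": "T", "ACA": "T", "ACG": "T",
}


def substitute(codon, offset, base):
    """Return `codon` with the base at 0-based `offset` replaced by `base`."""
    return codon[:offset] + base + codon[offset + 1:]


ile_codon = "ATC"                          # hypothetical isoleucine codon
thr_codon = substitute(ile_codon, 1, "C")  # T -> C at the second position

print(ile_codon, CODON_TABLE[ile_codon], "->", thr_codon, CODON_TABLE[thr_codon])
```

The same second-position change converts any of the three isoleucine codons into a threonine codon.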

[0665] Additionally, while sequencing the above-identified multiple clones, a novel rat HKNG clone was isolated. This clone, which completely lacks the corresponding exon 9 of the full length HKNG1 cDNA sequence, is referred to herein as rHKNG1Δ9. Because the deletion of exon 9 from the full length rHKNG1 sequence leads to an immediate frameshift, the clone rHKNG1Δ9 encodes a truncated form of the rHKNG1 protein. The rHKNG1Δ9 cDNA sequence (SEQ ID NO:138) is depicted in FIG. 42A and the predicted amino acid sequence (SEQ ID NO:139) of the rHKNG1Δ9 gene product it encodes is depicted in FIG. 42B. Thus, the rat HKNGΔ9 isoform lacks the sequence that would be homologous to exon 9 in human HKNG. This isoform would cause truncation of the predicted peptide and add additional amino acids not found in full length rat HKNG.
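The effect described above, in which skipping an exon both introduces novel amino acids and truncates the product, arises whenever the skipped exon's length is not a multiple of three. The toy example below illustrates the mechanism only. The sequences and exon lengths are hypothetical and bear no relation to the actual rat HKNG1 exons, whose sequences are given in the figures.

```python
def reading_frame_shift(exon_length):
    """Frame offset (0, 1 or 2) imposed downstream when an exon is skipped."""
    return exon_length % 3


def codons(seq):
    """Split a sequence into complete codons, discarding any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical toy transcript; lengths and bases are illustrative only.
upstream, skipped_exon, downstream = "ATGGCA", "GTTCA", "GGATCCTAAGCT"

full = codons(upstream + skipped_exon + downstream)
spliced = codons(upstream + downstream)

print("frame shift:", reading_frame_shift(len(skipped_exon)))
print("full length:", full)
print("exon skipped:", spliced)
```

In this toy case the exon-skipped transcript keeps the first two codons, reads two new codons in the shifted frame, and then reaches a TAA stop codon that is not in frame in the full length transcript.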

20. EXAMPLE Localization of the TS Gene to Chromosome 18

[0666] In the example presented in this section, studies are described that, first, define an interval of approximately 310 kb on the short arm of human chromosome 18 within which a region associated with a neuropsychiatric disorder is located, and second, identify a known gene, TS, which lies within this region and which is therefore a candidate gene for mediating neuropsychiatric disorders, including, without limitation, BAD.

20.1. MATERIALS AND METHODS

[0667] BAC Mapping:

[0668] The STSs from the region were used to screen a human BAC library (Research Genetics, Huntsville, Ala.). The ends of the BACs were cloned or directly sequenced. The end sequences were used to amplify the next overlapping BACs. From each BAC, additional microsatellites were identified. Standard sequence-tagged site (STS) content mapping was performed with microsatellite markers and non-polymorphic STSs available from databases that surround the genetically defined candidate region to order the markers on the physical map. Random sheared libraries were prepared from overlapping BACs within the defined genetic interval. BAC DNA was sheared with a nebulizer (CIS-US Inc., Bedford, Mass.). Fragments in the size range of 600-1000 base pairs were utilized for the sublibrary microsatellite probes. Sequences around such repeats were obtained to enable development of PCR primers for genomic DNA.

[0669] Mapping of Known Genes to the High Resolution Physical Map:

[0670] Many known genes are reported to be located in the chromosome 18 short arm telomere region. STS markers derived from these genes were either available in public databases (TS) or were designed for each of these genes, and STS-content mapping was performed as done with other microsatellite markers and non-polymorphic STSs. Additional known genes (centric and photoreceptor) were identified by sequencing of random clones from BACs in the interval, which contained a portion of the known gene.

[0671] Sample Sequencing:

[0672] Random sheared libraries were made from all the BACs within the defined genetic region. Approximately 9,000 subclones within the approximately 310 kb region were sequenced with vector primers in order to achieve an 8-fold sequence coverage of the region. All sequences were processed through an automated sequence analysis pipeline that assessed quality, removed vector sequences and masked repetitive sequences. The resulting sequences were then compared to public DNA and protein databases using the BLAST algorithms (Altschul et al., 1990, J. Mol. Biol. 215:403-410).
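The relationship between the number of subclones, the size of the region and the fold coverage can be sketched with the usual estimate, coverage = reads × mean read length / region length. The calculation below assumes one read per subclone, which the text does not state, and the function names are hypothetical; it simply derives the mean usable read length implied by the figures given.

```python
def fold_coverage(n_reads, mean_read_bp, region_bp):
    """Average number of times each base of the region is sequenced."""
    return n_reads * mean_read_bp / region_bp


def implied_read_length(n_reads, coverage, region_bp):
    """Mean usable read length needed to reach `coverage` with `n_reads`."""
    return coverage * region_bp / n_reads


# Figures from the text: ~9,000 subclones, ~310 kb region, 8-fold coverage.
read_bp = implied_read_length(n_reads=9_000, coverage=8.0, region_bp=310_000)

print(f"implied mean usable read length: {read_bp:.0f} bp")
```

Under these assumptions each subclone contributes on the order of 275 bp of usable sequence after quality trimming and vector removal.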

[0673] A high resolution physical map of the 18p telomere candidate region was developed using BAC and RH techniques.

[0674] BAD genes have been reported to map to 18q and 18p, including a broad undefined region flanking marker D18S59. For such physical mapping, the region between publicly available markers SHGC11249 and D18S481, which spans the most telomeric region of chromosome 18 of approximately 5 Mb, was mapped and contiged with BACs.

[0675] TS encodes thymidylate synthase. Thymidylate synthase catalyzes the transfer of a methyl group to deoxyuridine-5-prime-monophosphate to form thymidine-5-prime-monophosphate (TMP). It is important to the de novo production of TMP for DNA synthesis. Thymidylate synthase has been of considerable interest as a target for cancer chemotherapeutic agents. Takeishi et al. (1989) isolated phage clones covering the functionally active TS gene and described its genomic structure. By nonisotopic in situ hybridization, Hori et al. (1990) defined the location of the gene to 18p11.32. By the STS-content mapping described above, the TS gene was mapped precisely to the middle of the 310 kb interval.

[0676] Thymidylate synthase (TS) is a key enzyme in DNA replication, because it catalyzes the only de novo pathway of dTTP and plays an essential role in regulating a balanced supply of the four DNA precursors for maintaining a normal rate of DNA synthesis at a defined stage of the cell division cycle. Various studies have indicated that thymidylate stress conditions, in which thymidylate synthase activity is limited, perturb the levels of deoxynucleoside triphosphate pools and result in various genetic instabilities, such as mutation, genetic recombination, DNA fragmentation, chromosome aberration and sister chromatid exchange (Ayusawa et al., 1983; Meuth 1984; Hori et al. 1984a, b; Seno et al. 1985). In addition, both low and high thymidylate stress conditions induce the expression of fragile sites on human chromosomes (Sutherland and Hecht 1985; Hori et al. 1988). Since thymidylate synthase is known to be a component of a multienzyme complex, with other enzymes such as DNA polymerase, ribonucleotide reductase, thymidine kinase and dihydrofolate reductase (Reddy and Pardee, 1980), it is important to determine the organization and chromosomal locations of the genes encoding these functionally related enzymes.

[0677] Thymidylate synthase is one of the members of a multienzyme complex known as “replitase” (Reddy and Pardee 1980). The assembly of DNA precursor-synthesizing enzymes with a DNA replication apparatus seems to facilitate the most efficient supply of DNA precursors. The following seven housekeeping genes, encoding enzymes involved in DNA biosynthesis, have been mapped on human chromosomes (Human Gene Mapping 10 1989): DNA polymerase alpha (POLA) at Xp22.1-p21.3, DNA polymerase beta (POLB) at 8p12-p11, thymidine kinase (TK) at 17q23.3-q25.3, dihydrofolate reductase (DHFR) at 5q11.2-q13.2, ribonucleotide reductase M1 peptide (RRM1) at 11p15.5-p15.4, ribonucleotide reductase M2 peptide (RRM2) at 2p25-2p24 and TS at 18p11.32. Thus, there seems to be no obligatory clustering of the housekeeping genes involved in DNA metabolism. It has been demonstrated that the expression of the TS gene, like that of other housekeeping genes, is regulated at a post-transcriptional level (Ayusawa et al. 1986).

20.2. RESULTS

[0678] In respect of the chromosome mapping of the gene encoding thymidylate synthase, two provisional assignments to chromosome 18 have been reported. Hori et al. (1985) mapped the TS gene to chromosome 18 by assaying the enzyme activity in somatic cell hybrids prepared by fusing a line of thymidylate synthase-negative mouse mutant FM3A cells and human diploid fibroblasts from a male patient with the fragile X syndrome. Furthermore, the analysis of one hybrid clone with a deletion of chromosome 18 suggested that the gene was located in the region of 18pter-q12. The TS gene was also mapped to the same chromosome by the complementation of thymidine auxotrophy of Chinese hamster V79 mutant cells and Southern blot analysis of a panel of human-hamster cell hybrids with a mouse cDNA probe (Nussbaum et al. 1985). Quantitative Southern blot analysis of unbalanced human cell lines further localized the gene to 18q21-qter. These two chromosomal regions assigned for the location of the TS gene do not overlap (Human Gene Mapping 10, 1989). In an attempt to resolve this discrepancy and define a more precise location for the gene, nonisotopic in situ hybridization experiments were performed by Hori et al. (Human Genetics 85:576-580 (1990)) using biotinylated cDNA and genomic DNA probes of the human TS gene.

[0679] The precise location of the TS gene in the telomeric region of chromosome 18 makes the gene potentially useful for the construction of both physical and genetic linkage maps of this chromosome. A preliminary genetic linkage map of chromosome 18, consisting of twelve loci, has already been reported (O'Connell et al. 1988). However, the actual coverage of chromosome 18 by this map is incomplete because of the lack of telomeric DNA markers. The TS gene thus provides a useful telomeric anchor point on the short arm of chromosome 18 for further investigation of the linkage map. The TS gene can also be used for the analysis of clinical disorders associated with anomalies of chromosome 18, such as the tetrasomy 18p syndrome described above. Furthermore, it can be used for linkage studies with genetic disorders mapped on chromosome 18, such as multiple hereditary cutaneous leiomyomata (McKusick 1986), since highly polymorphic alleles can be detected at the TS locus in Japanese populations (H. Akazawa, D. Ayusawa, S. Kaneda, K. Shimizu, K. Takeishi, T. Seno, manuscript in preparation).

21. EXAMPLE Fine-Scale Mapping of a Locus for Severe Bipolar Mood Disorder on Chromosome 18p11.3 in the Costa Rican Population

[0680] The example presented in this Section describes studies that search for genes predisposing individuals to bipolar disorder by studying individuals with the most extreme form of the affected phenotype, BP-I, ascertained from the genetically isolated population of the Central Valley of Costa Rica (CVCR) (McInnes, L. A. et al. Fine-scale mapping of a locus for severe bipolar mood disorder on chromosome 18p11.3 in the Costa Rican population. Manuscript submitted for publication to Nature Genetics, the entire text of which is incorporated by reference herein). Linkage analysis was performed on two extended CVCR BP-I pedigrees (CR001 and CR004) (McInnes, L. A. et al. PNAS 93, 13060-13065 (1996)), and linkage disequilibrium (LD) analyses were performed on a population-based sample characterized by an even more extreme phenotype, defined as BP-I with at least two psychiatric hospitalizations (Escamilla, M. et al. Am. J. Hum. Genet. 64, 1670-1678 (1999)). Results from both of these approaches implicated markers in the same region on 18p11.3. This region was further investigated for evidence of a BP susceptibility locus by creating a physical map and developing a large number of microsatellite and single nucleotide polymorphism (SNP) markers for typing in the pedigree and population samples. This example summarizes the results of fine-scale association analyses in the population sample, as well as the haplotype data generated for the BP-I patients in CR001. The results suggest a candidate region containing six genes.

21.1. MATERIALS AND METHODS

[0681] Sample Collection:

[0682] Details regarding the composition, ascertainment and diagnostic procedures for the population sample analyzed in this example can be found in Escamilla, M. et al. (Am. J. Hum. Genet. 64, 1670-1678 (1999)) and in Escamilla et al. (manuscript in submission). Details regarding the recruitment and composition of the control sample can be found in Escamilla et al. (manuscript in submission).

[0683] Radiation Hybrid and STS-Content Mapping of Markers Within the Candidate Interval:

[0684] Genetic and physical mapping information was initially obtained from various online sources, such as the Whitehead Institute for Biomedical Research/MIT Center for Genome Research (http://www-genome.wi.mit.edu), the Stanford Human Genome Center (http://www-shgc.stanford.edu), the GÉNÉTHON Human Genome Research Center (http://www.genethon.fr/genethon_en.html), and the Cooperative Human Linkage Center (http://lpg.nci.nih.gov/CHLC). Radiation hybrid (RH) mapping (Cox, D. R. et al. Science 250, 245-250 (1990)) was used extensively in the early phase of this study to resolve discrepancies in marker order between maps. Specifically, the Stanford G3 radiation hybrid panel (83 hybrids) was used to map all genetic and STS markers available from public databases as well as those developed specifically for the project. In addition to RH mapping, STS-content mapping using BAC (Bacterial Artificial Chromosome) clones from the region of interest was used routinely to determine the marker order and to complete the BAC contig.

[0685] BAC Library Screening, End Sequencing and Contig Building:

[0686] Microsatellite and STS markers obtained from public databases were used to screen the human BAC library from Research Genetics (Huntsville, Ala.) by PCR, or to screen the BAC library from Genome Systems (St. Louis, Mo.) by hybridization, according to the manufacturers' protocols. BAC DNA from positive clones was prepared using Qiagen tip 2500 columns following the Qiagen Mega Prep protocol (Qiagen, Valencia, Calif.) with minor modifications. Sequences of the BAC ends were obtained by cycle sequencing the BAC DNA directly with the vector primers T7 and SP6, respectively. Reactions were analyzed on an ABI 377 DNA sequencer (PE Biosystems, Foster City, Calif.). PCR primers were designed from non-repetitive end sequences and used as STS markers to improve the physical map and the BAC contig construction. The outlying markers from each side of the contigs were used to screen for overlapping BAC clones to extend the contigs.

[0687] Construction of Randomly Sheared Libraries From BACs:

[0688] BAC DNA was sheared to small fragments of the desired size range using a nebulizer (CIS-US, Inc., Bedford, Mass.) in a buffer containing 50-100 µg DNA, 25% glycerol, 55 mM Tris and 15 mM MgCl₂. The mixture was added to the nebulizer, and the gas pressure was set according to conditions worked out on comparable salmon sperm DNA in a pilot experiment. After shearing, the libraries were constructed as previously described (Pulido, J. C. & Duyk, G. M. In “Current Protocols in Human Genetics.” Unit 2.2, Greene Publishing and Wiley, New York (1994)).

[0689] Microsatellite and SNP Marker Development:

[0690] Microsatellite markers were generated by hybridization of oligonucleotide probes for di-, tri- and tetranucleotide repeats to randomly sheared sublibraries made from BAC clones, using Quicklite non-isotopic enzyme-induced chemiluminescent reagents from Lifecodes Corp. (Stamford, Conn.) following the manufacturer's instructions. Positive clones were sequenced to identify the microsatellite sequences. Primer sets were then designed from flanking unique DNA sequence. Primers for STS markers were also designed using BAC end sequences and random sequences available within the candidate interval once extensive sequencing of the randomly sheared libraries had been done.

[0691] SSCP (Single Strand Conformational Polymorphism) Analysis:

[0692] 2.5 µl of PCR product was mixed with 4 µl of blue dye (95% formamide, 20 mM EDTA, 0.05% Bromophenol Blue and 0.05% Xylene cyanol FF), denatured at 100° C. for 10 min and immediately chilled on ice. 2.5 µl was run on a 6% SSCP gel in 0.5× TBE buffer in the gel apparatus (Life Technologies, Inc., Rockville, Md.) for about 16 hrs at 4° C. The gel was stained with SYBR Green I nucleic acid gel stain and SYBR Green II RNA gel stain (Molecular Probes, Eugene, Oreg.) and visualized using the Fluorimager 575 (Amersham, Piscataway, N.J.). When shifted bands were observed, the nucleotide basis for the polymorphism was determined by directly sequencing the PCR product.

[0693] Sequencing of the Candidate Interval and Identification of the Candidate Genes:

[0694] When the candidate interval was sufficiently narrowed to approximately 0.5 Mb, randomly sheared libraries prepared from BACs covering this region were sequenced at 10× coverage to discover all sequence information and identify all genes within the interval. More than 10,000 individual sequences from the region were compared by BLAST with sequences from publicly available databases and were analyzed using GRAIL to identify potential coding sequences. In addition, sequences were assembled using PHRAP into a single contiguous DNA sequence of ~340 kb. The whole sequence was again analyzed using BLAST and GRAIL to aid in gene prediction. These data were displayed in ACEdb (data available from ncbi.nlm.nih.gov) to visualize predicted exons and their relationships to each other.
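The figures quoted above can be cross-checked with the usual relation coverage = (number of reads × mean read length) / region length. The sketch below is illustrative only: the mean read length is not stated in this example, so the value it derives is merely what 10,000 reads, a ~340 kb assembly and 10× coverage would jointly imply, and both function names are hypothetical.

```python
def coverage(n_reads, mean_read_len_bp, region_bp):
    """Average sequencing depth: total sequenced bases divided by region length."""
    return n_reads * mean_read_len_bp / region_bp


def implied_read_length(target_coverage, n_reads, region_bp):
    """Mean read length (bp) needed for n_reads to reach target_coverage."""
    return target_coverage * region_bp / n_reads


# Assumed inputs taken from the text: 10x coverage, 10,000 reads, ~340 kb assembly.
print(implied_read_length(10, 10_000, 340_000))
```

Because the text says "more than 10,000" sequences, the implied mean read length is an upper bound rather than a measured value.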

[0695] Genotyping of Microsatellites:

[0696] The following publicly available markers were genotyped in the candidate region on 18p11.3: SAVA5 from the Donis-Keller laboratory; D18S1140, D18S59, D18S1105 and D18S476 from Genethon; GATA166DO5 from the Cooperative Human Linkage Center; and PACAP, designed by this group from known sequence data of this gene. Genotyping procedures for the microsatellites were performed as previously described in Bull, L. N. et al. (Hum. Genet. 104, 241-248 (1999)). In brief, one of the two primers was labeled radioactively with a polynucleotide kinase, and PCR products were separated by electrophoresis on polyacrylamide gels. Autoradiographs were scored independently by two raters without knowledge of the affection status of the samples. Data for each marker were entered into the computer database twice, and the resultant files were compared for discrepancies and checked for non-Mendelian errors.
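The two quality-control steps just described, comparison of duplicate data entries and detection of non-Mendelian inheritance, can be sketched as follows. This is a minimal illustration and not the software used in the study; the function names, the sample identifiers and the allele numbers are all hypothetical, and genotypes are assumed to be unordered pairs of allele labels.

```python
def entry_discrepancies(first_pass, second_pass):
    """Return (sample, marker) keys whose genotypes differ between two data entries.

    Each argument maps (sample_id, marker) -> (allele_a, allele_b).
    Allele order within a genotype is ignored; a key missing from one file counts
    as a discrepancy.
    """
    keys = set(first_pass) | set(second_pass)
    return sorted(
        k for k in keys
        if sorted(first_pass.get(k, ())) != sorted(second_pass.get(k, ()))
    )


def is_mendelian(child, father, mother):
    """True if the child's two alleles can be drawn one from each parent."""
    a, b = child
    return (a in father and b in mother) or (a in mother and b in father)


# Hypothetical duplicate entries for one marker.
entry1 = {("S1", "D18S59"): (3, 5), ("S2", "D18S59"): (2, 2)}
entry2 = {("S1", "D18S59"): (5, 3), ("S2", "D18S59"): (2, 4)}
print(entry_discrepancies(entry1, entry2))
print(is_mendelian((3, 5), father=(3, 3), mother=(5, 7)))
```

A flagged key would be resolved by returning to the autoradiograph, while a trio failing the second check would point to a scoring error or a sample mix-up.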

[0697] Statistical Analyses:

[0698] A modified version of Terwilliger's likelihood-ratio test of LD (Terwilliger, J. D. Am. J. Hum. Genet. 56, 777-778 (1995)) was applied to the 10 microsatellites and 26 single nucleotide polymorphisms (SNPs) that spanned the 300 kb candidate region. For each of these 36 markers the test was applied twice: once in the sample of 227 patients and their available relatives (N=563), and once with the addition of the independent control trios to the 227 patients and relatives (N=641). This likelihood-ratio test estimates a single parameter, lambda, which quantifies potential overrepresentation of marker alleles on disease chromosomes versus control chromosomes. Through simulations, Terwilliger shows that this test is conservative. The procedure of Terwilliger was modified as described in a previous LD paper (Escamilla, M. et al. Am. J. Hum. Genet. 64, 1670-1678 (1999)) in order to incorporate data from additional family members when the parents were not available. The same genetic model of disease transmission (mostly dominant with reduced penetrance) was used as in the previous LD papers (Escamilla, M. et al. Am. J. Hum. Genet. 64, 1670-1678 (1999) and Escamilla et al. in submission) and in the genome screen of the Costa Rican pedigrees described in McInnes et al. (McInnes, L. A. et al. PNAS 93, 13060-13065 (1996)). The use of a model is likely to increase the power of the test and the precision of the estimates of lambda when the inheritance pattern is approximately known (Terwilliger, J. D. Am. J. Hum. Genet. 56, 777-778 (1995)).
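To make the role of lambda concrete, the following is a deliberately simplified sketch of a likelihood-ratio test of this general kind for a single biallelic marker. It is not the modified procedure used in this example: it ignores relatives and the genetic model, treats the control allele frequency p as known, and maximizes over a grid. It assumes that if an allele with control frequency f is the associated one, its frequency on disease chromosomes is f + lambda(1 - f), and that the overall likelihood weights each candidate allele by its control frequency. All names and counts are hypothetical.

```python
import math


def log_lik(lam, n1, n2, p):
    """Log of the frequency-weighted mixture likelihood for a biallelic marker.

    n1, n2: counts of allele 1 and allele 2 on disease chromosomes.
    p: control frequency of allele 1 (0 < p < 1); lam must lie in [0, 1).
    """
    terms = []
    for freq, n_assoc, n_other in ((p, n1, n2), (1 - p, n2, n1)):
        q = freq + lam * (1 - freq)  # associated-allele frequency on disease chromosomes
        ll = math.log(freq) + n_assoc * math.log(q)
        if n_other:
            ll += n_other * math.log(1 - q)
        terms.append(ll)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def lrt_ld(n1, n2, p, steps=1000):
    """Grid-search estimate of lambda and the statistic 2*[lnL(lambda_hat) - lnL(0)]."""
    grid = [i / steps for i in range(steps)]  # lambda values in [0, 1)
    best = max(grid, key=lambda lam: log_lik(lam, n1, n2, p))
    stat = 2 * (log_lik(best, n1, n2, p) - log_lik(0.0, n1, n2, p))
    return best, stat


print(lrt_ld(80, 20, 0.5))  # strong excess of allele 1 on disease chromosomes
print(lrt_ld(50, 50, 0.5))  # no excess: lambda_hat is 0 and the statistic is 0
```

In this toy setting lambda is 0 when disease and control chromosomes have the same allele frequencies and approaches 1 as the associated allele becomes fixed on disease chromosomes.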

21.2. RESULTS

[0699] In a previous LD study of chromosome 18 in a population sample of BP-I patients from the CVCR (Escamilla, M. et al. Am. J. Hum. Genet. 64, 1670-1678 (1999)), the highest level of evidence for association was obtained at marker D18S59 in 18p11.3. A flanking marker, D18S476, also gave a moderately positive signal. Interestingly, the associated allele at D18S59 in the population sample also provided the second highest evidence for linkage of 473 markers used in a previous genome-wide screen of Costa Rican pedigree CR001 (McInnes, L. A. et al. PNAS 93, 13060-13065 (1996)); the allele at D18S476 carried by BP-I patients in CR001 was also the same as the associated allele in the population sample. Fine mapping of a BP-I susceptibility locus in this region was initiated by choosing publicly available markers from various databases and ordering them using radiation hybrid and STS mapping strategies (see methods described above). Markers typed in the interval between D18S59 and D18S476 in the original population sample and the pedigree CR001 suggested that the maximal region of identity-by-descent (IBD) sharing among these individuals was between D18S59 and PACAP. Marker development and physical mapping efforts were thus focused on the region between SAVA5 (the most telomeric marker to D18S59) and PACAP. During construction of the physical map, 4 novel microsatellite markers and 26 new SNPs were discovered. These markers were genotyped in a larger sample of 227 CVCR BP-I patients (including the original set of 69) with available first degree relatives, in the previously studied individuals from pedigree CR001, and in a sample of controls recruited from the University of Costa Rica who met the same requirements for CVCR ancestry as did the BP-I patients in the population sample. LD analysis was performed using the likelihood test proposed by Terwilliger (Terwilliger, J. D. Am. J. Hum. Genet. 56, 777-778 (1995)); the results for all markers in the population sample, with and without controls, are displayed in Table 15 (only seven of the new SNPs, PH33, PH84, PH205, PH202, PH208, TS16 and TS30, are depicted in Table 15 below). Primers used to obtain the sequences of the SNPs for each of PH33, PH84, PH205, PH202, PH208, TS16 and TS30 are shown in Table 16. FIGS. 47A-C display the markers where the associated alleles in the population sample are shared IBD between the patients in CR001.

[0700] Table 15. Column 227 Lambda indicates the lambda value for the 227 patients analyzed with relatives. Column 227+ includes patients, their relatives and controls. Columns to the right of the table indicate the markers where alleles are shared identically by descent with BP-I patients from CR001. Group A indicates haplotypes shared by CR001 ID numbers 4020, 6001 and 5061. Group B includes CR001 ID numbers 4226 and 5271. Group C includes ID numbers 5025 and 5036. Of note, all 8 of the predominantly phase known or reconstructed BP-I individuals from CR001 also shared haplotypes surrounding this region of at least 5 cM within their group.

          227                         227+                        CR001
Marker    Lambda   Chisq   Pval       Lambda   Chisq   Pval       Group A / Group B / Group C
PH33      0.00                        0.66     2.81    0.047
PH84      0.90     10.29   0.0007     0.78     4.40    0.018      X X X
PH205     1.00     3.98    0.023      1.00     7.14    0.004      X X X
PH202     0.99     2.26    0.066      1.00     9.03    0.001      X X X
PH208     0.96     2.20    0.069      1.00     5.96    0.007      X
TS16      0.00                        0.84     4.78    0.014      X
TS30      0.00                        0.88     7.31    0.003      X
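The Chisq and Pval columns of Table 15 appear to be related in a simple way: each tabulated p-value matches half of the upper-tail probability of a chi-square distribution with one degree of freedom, as would be expected for a one-sided test in which lambda is constrained to be non-negative. That reading is an inference from the numbers, not a statement made in the text, and the function name below is hypothetical. A minimal sketch using only the Python standard library:

```python
import math


def one_sided_p(chisq):
    """Half the upper-tail probability of a 1-df chi-square at the given statistic.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)); the factor 0.5 reflects the assumed
    one-sided alternative (lambda > 0).
    """
    return 0.5 * math.erfc(math.sqrt(chisq / 2.0))


# Chi-square values taken from Table 15 (PH84, 227 and 227+; PH205, 227+).
for chisq in (10.29, 4.40, 7.14):
    print(chisq, one_sided_p(chisq))
```

Rounded to the precision of the table, these three statistics give 0.0007, 0.018 and 0.004, which agree with the tabulated entries.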

[0701] TABLE 16 Family Haplotype Data

                                                                   Allele associated with
Marker   Primer Sequences                         Polymorphism    the disease haplotype
PH33     Forward: GAGAACCGCTTTATTCCCAGG           SNP             2
         Reverse: CTTTTCTCTAACCTCCTAGCAG
PH84     Forward: GGGACCATATGTACATGTATGC          SNP             1
         Reverse: CTGCAATGCATTAATTTGCACAATG
PH205    Forward: AGATTGCCCTTGGAGCACTTAG          SNP             2
         Reverse: GCTCTCAGGTGCAACTTTTAAG
PH202    Forward: AGAAACGGGTCAGGTCTAGAG           SNP             2
         Reverse: TCTAGAGGTAGACACACATGTC
PH208    Forward: GTTACTGAGTCATCAACAGATCT         SNP
         Reverse: TGAACGTTCATAAAGAGTCACATG
TS16     Forward: TCACAGTGTCCTTTTGTGACTG          SNP
         Reverse: GTGTTTTCCATAAAATACGTATGTC
TS30     Forward: GCACCTACTGGTATAAATGCAC          SNP
         Reverse: TTCTTCATAGAACTGATATTCTGG
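Basic properties of the primers in Table 16 can be checked with a few lines of code. The sketch below computes length, GC content and a rough melting temperature by the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C, which is only a coarse approximation for short oligonucleotides. No annealing temperatures or PCR conditions are given in this example, so the output is an estimate rather than a description of the conditions used; the function names are hypothetical.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature (degrees C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# PH33 primer pair as listed in Table 16.
ph33_forward = "GAGAACCGCTTTATTCCCAGG"
ph33_reverse = "CTTTTCTCTAACCTCCTAGCAG"
for name, seq in (("forward", ph33_forward), ("reverse", ph33_reverse)):
    print(name, len(seq), round(gc_content(seq), 3), wallace_tm(seq))
```

A large difference in estimated Tm between the two primers of a pair would suggest rechecking the transcribed sequence before ordering or reusing the primers.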

22. REFERENCES CITED

[0702] The present invention is not to be limited in scope by the specific embodiments described herein, which are intended as single illustrations of individual aspects of the invention, and functionally equivalent methods and components are within the scope of the invention. Indeed, various modifications of the invention, in addition to those shown and described herein, will become apparent to those skilled in the art from the foregoing description and accompanying drawings.

[0703] The discussion or citation of a reference herein shall not be construed as an admission that such reference is prior art to the present invention. All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

1 165 1 2055 DNA Homo sapiens CDS (285)...(1769) 1 tgcgtcacct gcaggcccgggccgcggggt tggtttccac cctggaggtt gctgacaccc 60 tgtgccctcg gctgacttccagccggtggc acagacgcct ccagggggca gcactcaagc 120 gcatcttagg aatgacagagttgcgtccct ctctgttgcc aggctggagt tcagtggcat 180 gttcttagct cactgaagcctcaaattcct gggttcaagt gaccctccca cctcagcccc 240 atgaggacct gggactacaggacacagcta aatccctgac acgg atg aaa att aaa 296 Met Lys Ile Lys 1 gca gagaaa aac gaa ggt cct tcc aga agc tgg tgg caa ctt cac tgg 344 Ala Glu LysAsn Glu Gly Pro Ser Arg Ser Trp Trp Gln Leu His Trp 5 10 15 20 gga gatatt gca aat aac agc ggg aac atg aag ccg cca ctc ttg gtg 392 Gly Asp IleAla Asn Asn Ser Gly Asn Met Lys Pro Pro Leu Leu Val 25 30 35 ttt att gtgtgt ctg ctg tgg ttg aaa gac agt cac tgc gca ccc act 440 Phe Ile Val CysLeu Leu Trp Leu Lys Asp Ser His Cys Ala Pro Thr 40 45 50 tgg aag gac aaaact gct atc agt gaa aac ctg aag agt ttt tct gag 488 Trp Lys Asp Lys ThrAla Ile Ser Glu Asn Leu Lys Ser Phe Ser Glu 55 60 65 gtg ggg gag ata gatgca gat gaa gag gtg aag aag gct ttg act ggt 536 Val Gly Glu Ile Asp AlaAsp Glu Glu Val Lys Lys Ala Leu Thr Gly 70 75 80 att aag caa atg aaa atcatg atg gaa aga aaa gag aag gaa cac acc 584 Ile Lys Gln Met Lys Ile MetMet Glu Arg Lys Glu Lys Glu His Thr 85 90 95 100 aat cta atg agc acc ctgaag aaa tgc aga gaa gaa aag cag gag gcc 632 Asn Leu Met Ser Thr Leu LysLys Cys Arg Glu Glu Lys Gln Glu Ala 105 110 115 ctg aaa ctt ctg aat gaagtt caa gaa cat ctg gag gaa gaa gaa agg 680 Leu Lys Leu Leu Asn Glu ValGln Glu His Leu Glu Glu Glu Glu Arg 120 125 130 cta tgc cgg gag tct ttggca gat tcc tgg ggt gaa tgc agg tct tgc 728 Leu Cys Arg Glu Ser Leu AlaAsp Ser Trp Gly Glu Cys Arg Ser Cys 135 140 145 ctg gaa aat aac tgc atgaga att tat aca acc tgc caa cct agc tgg 776 Leu Glu Asn Asn Cys Met ArgIle Tyr Thr Thr Cys Gln Pro Ser Trp 150 155 160 tcc tct gtg aaa aat aagatt gaa cgg ttt ttc agg aag ata tat caa 824 Ser Ser Val Lys Asn Lys IleGlu Arg Phe Phe Arg Lys Ile Tyr Gln 165 170 175 180 ttt cta ttt cct ttccat gaa gat 
aat gaa aaa gat ctc ccc atc agt 872 Phe Leu Phe Pro Phe HisGlu Asp Asn Glu Lys Asp Leu Pro Ile Ser 185 190 195 gaa aag ctc att gaggaa gat gca caa ttg acc caa atg gag gat gtg 920 Glu Lys Leu Ile Glu GluAsp Ala Gln Leu Thr Gln Met Glu Asp Val 200 205 210 ttc agc cag ttg actgtg gat gtg aat tct ctc ttt aac agg agt ttt 968 Phe Ser Gln Leu Thr ValAsp Val Asn Ser Leu Phe Asn Arg Ser Phe 215 220 225 aac gtc ttc aga cagatg cag caa gag ttt gac cag act ttt caa tca 1016 Asn Val Phe Arg Gln MetGln Gln Glu Phe Asp Gln Thr Phe Gln Ser 230 235 240 cat ttc ata tca gataca gac cta act gag cct tac ttt ttt cca gct 1064 His Phe Ile Ser Asp ThrAsp Leu Thr Glu Pro Tyr Phe Phe Pro Ala 245 250 255 260 ttc tct aaa gagccg atg aca aaa gca gat ctt gag caa tgt tgg gac 1112 Phe Ser Lys Glu ProMet Thr Lys Ala Asp Leu Glu Gln Cys Trp Asp 265 270 275 att ccc aac ttcttc cag ctg ttt tgt aat ttc agt gtc tct att tat 1160 Ile Pro Asn Phe PheGln Leu Phe Cys Asn Phe Ser Val Ser Ile Tyr 280 285 290 gaa agt gtc agtgaa aca att act aag atg ctg aag gca ata gaa gat 1208 Glu Ser Val Ser GluThr Ile Thr Lys Met Leu Lys Ala Ile Glu Asp 295 300 305 tta cca aaa caagac aaa gct cct gac cac gga ggc ctg att tca aag 1256 Leu Pro Lys Gln AspLys Ala Pro Asp His Gly Gly Leu Ile Ser Lys 310 315 320 atg tta cct gggcag gac aga gga ctg tgt ggg gaa ctt gac cag aat 1304 Met Leu Pro Gly GlnAsp Arg Gly Leu Cys Gly Glu Leu Asp Gln Asn 325 330 335 340 ttg tca agatgt ttc aaa ttt cat gaa aaa tgc caa aaa tgt cag gct 1352 Leu Ser Arg CysPhe Lys Phe His Glu Lys Cys Gln Lys Cys Gln Ala 345 350 355 cac cta tctgaa gac tgt cct gat gta cct gct ctg cac aca gaa tta 1400 His Leu Ser GluAsp Cys Pro Asp Val Pro Ala Leu His Thr Glu Leu 360 365 370 gac gag gcgatc agg ttg gtc aat gta tcc aat cag cag tat ggc cag 1448 Asp Glu Ala IleArg Leu Val Asn Val Ser Asn Gln Gln Tyr Gly Gln 375 380 385 att ctc cagatg acc cgg aag cac ttg gag gac acc gcc tat ctg gtg 1496 Ile Leu Gln MetThr Arg Lys His Leu Glu Asp Thr Ala Tyr Leu Val 390 395 400 gag aag atgaga ggg caa 
ttt ggc tgg gtg tct gaa ctg gca aac cag 1544 Glu Lys Met ArgGly Gln Phe Gly Trp Val Ser Glu Leu Ala Asn Gln 405 410 415 420 gcc ccagaa aca gag atc atc ttt aat tca ata cag gta gtt cca agg 1592 Ala Pro GluThr Glu Ile Ile Phe Asn Ser Ile Gln Val Val Pro Arg 425 430 435 att catgaa gga aat att tcc aaa caa gat gaa aca atg atg aca gac 1640 Ile His GluGly Asn Ile Ser Lys Gln Asp Glu Thr Met Met Thr Asp 440 445 450 tta agcatt ctg cct tcc tct aat ttc aca ctc aag atc cct ctt gaa 1688 Leu Ser IleLeu Pro Ser Ser Asn Phe Thr Leu Lys Ile Pro Leu Glu 455 460 465 gaa agtgct gag agt tct aac ttc att ggc tac gta gtg gca aaa gct 1736 Glu Ser AlaGlu Ser Ser Asn Phe Ile Gly Tyr Val Val Ala Lys Ala 470 475 480 cta cagcat ttt aag gaa cat ttt aaa acc tgg taagaagatc taatgcatcc 1789 Leu GlnHis Phe Lys Glu His Phe Lys Thr Trp 485 490 495 tatatccagt aagtagaattatctcttcat ctgggacctg gaaatcctga aataaaaaag 1849 gataatgcaa taaacacagttgcaggaaag tatgttagct atatactatg aagtactctt 1909 agtttactta tgttgaatggcttagctatt aatactcaaa ttgagttaaa atgaaaattc 1969 ctccttaaaa aatcaaacgtaatatgtatt acatttcatg gtacattagt agttctttgt 2029 atattgaata aatactaaatcaccta 2055 2 495 PRT Homo sapiens 2 Met Lys Ile Lys Ala Glu Lys Asn GluGly Pro Ser Arg Ser Trp Trp 1 5 10 15 Gln Leu His Trp Gly Asp Ile AlaAsn Asn Ser Gly Asn Met Lys Pro 20 25 30 Pro Leu Leu Val Phe Ile Val CysLeu Leu Trp Leu Lys Asp Ser His 35 40 45 Cys Ala Pro Thr Trp Lys Asp LysThr Ala Ile Ser Glu Asn Leu Lys 50 55 60 Ser Phe Ser Glu Val Gly Glu IleAsp Ala Asp Glu Glu Val Lys Lys 65 70 75 80 Ala Leu Thr Gly Ile Lys GlnMet Lys Ile Met Met Glu Arg Lys Glu 85 90 95 Lys Glu His Thr Asn Leu MetSer Thr Leu Lys Lys Cys Arg Glu Glu 100 105 110 Lys Gln Glu Ala Leu LysLeu Leu Asn Glu Val Gln Glu His Leu Glu 115 120 125 Glu Glu Glu Arg LeuCys Arg Glu Ser Leu Ala Asp Ser Trp Gly Glu 130 135 140 Cys Arg Ser CysLeu Glu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys 145 150 155 160 Gln ProSer Trp Ser Ser Val Lys Asn Lys Ile Glu Arg Phe Phe Arg 165 170 175 LysIle Tyr Gln Phe Leu 
Phe Pro Phe His Glu Asp Asn Glu Lys Asp 180 185 190Leu Pro Ile Ser Glu Lys Leu Ile Glu Glu Asp Ala Gln Leu Thr Gln 195 200205 Met Glu Asp Val Phe Ser Gln Leu Thr Val Asp Val Asn Ser Leu Phe 210215 220 Asn Arg Ser Phe Asn Val Phe Arg Gln Met Gln Gln Glu Phe Asp Gln225 230 235 240 Thr Phe Gln Ser His Phe Ile Ser Asp Thr Asp Leu Thr GluPro Tyr 245 250 255 Phe Phe Pro Ala Phe Ser Lys Glu Pro Met Thr Lys AlaAsp Leu Glu 260 265 270 Gln Cys Trp Asp Ile Pro Asn Phe Phe Gln Leu PheCys Asn Phe Ser 275 280 285 Val Ser Ile Tyr Glu Ser Val Ser Glu Thr IleThr Lys Met Leu Lys 290 295 300 Ala Ile Glu Asp Leu Pro Lys Gln Asp LysAla Pro Asp His Gly Gly 305 310 315 320 Leu Ile Ser Lys Met Leu Pro GlyGln Asp Arg Gly Leu Cys Gly Glu 325 330 335 Leu Asp Gln Asn Leu Ser ArgCys Phe Lys Phe His Glu Lys Cys Gln 340 345 350 Lys Cys Gln Ala His LeuSer Glu Asp Cys Pro Asp Val Pro Ala Leu 355 360 365 His Thr Glu Leu AspGlu Ala Ile Arg Leu Val Asn Val Ser Asn Gln 370 375 380 Gln Tyr Gly GlnIle Leu Gln Met Thr Arg Lys His Leu Glu Asp Thr 385 390 395 400 Ala TyrLeu Val Glu Lys Met Arg Gly Gln Phe Gly Trp Val Ser Glu 405 410 415 LeuAla Asn Gln Ala Pro Glu Thr Glu Ile Ile Phe Asn Ser Ile Gln 420 425 430Val Val Pro Arg Ile His Glu Gly Asn Ile Ser Lys Gln Asp Glu Thr 435 440445 Met Met Thr Asp Leu Ser Ile Leu Pro Ser Ser Asn Phe Thr Leu Lys 450455 460 Ile Pro Leu Glu Glu Ser Ala Glu Ser Ser Asn Phe Ile Gly Tyr Val465 470 475 480 Val Ala Lys Ala Leu Gln His Phe Lys Glu His Phe Lys ThrTrp 485 490 495 3 1957 DNA Homo sapiens CDS (241)...(1671) 3 tgcgtcacctgcaggcccgg gccgcggggt tggtttccac cctggaggtt gctgacaccc 60 tgtgccctcggctgacttcc agccggtggc acagacgcct ccagggggca gcactcaagc 120 gcatcttaggaatgacagag ttgcgtccct ctcggttgcc aggctggagt tcagtggcat 180 gttcatagctcactgaagcc tcaaattcct gggttcaagt gaccctccta cctcagcccc 240 atg agg acctgg gac tac agt aac agc ggg aac atg aag ccg cca ctc 288 Met Arg Thr TrpAsp Tyr Ser Asn Ser Gly Asn Met Lys Pro Pro Leu 1 5 10 15 ttg gtg tttatt gtg tgt ctg ctg tgg ttg aaa gac agt 
cac tcc gca 336 Leu Val Phe IleVal Cys Leu Leu Trp Leu Lys Asp Ser His Ser Ala 20 25 30 ccc act tgg aaggac aaa agt gct atc agt gaa aac ctg aag agt ttt 384 Pro Thr Trp Lys AspLys Ser Ala Ile Ser Glu Asn Leu Lys Ser Phe 35 40 45 tct gag gtg ggg gagata gat gca gat gaa gag gtg aag aag gct ttg 432 Ser Glu Val Gly Glu IleAsp Ala Asp Glu Glu Val Lys Lys Ala Leu 50 55 60 act ggt att aag caa atgaaa atc atg atg gaa aga aaa gag aag gca 480 Thr Gly Ile Lys Gln Met LysIle Met Met Glu Arg Lys Glu Lys Ala 65 70 75 80 aac cag gcc cca gaa acagag atc atc ttt aat tca ata cag gta gtt 528 Asn Gln Ala Pro Glu Thr GluIle Ile Phe Asn Ser Ile Gln Val Val 85 90 95 cca agg att gaa cac acc aatcta atg agc acc ctg aag aaa tgc aga 576 Pro Arg Ile Glu His Thr Asn LeuMet Ser Thr Leu Lys Lys Cys Arg 100 105 110 gaa gaa aag cag gag gcc ctgaaa ctt ctg aat gaa gtt caa gaa cat 624 Glu Glu Lys Gln Glu Ala Leu LysLeu Leu Asn Glu Val Gln Glu His 115 120 125 ctg gag gaa gaa gaa agg ctatgc cgg gag tct ttg gca gat tcc tgg 672 Leu Glu Glu Glu Glu Arg Leu CysArg Glu Ser Leu Ala Asp Ser Trp 130 135 140 ggt gaa tgc agg tct tgc ctggaa aat aac tgc atg aga att tat aca 720 Gly Glu Cys Arg Ser Cys Leu GluAsn Asn Cys Met Arg Ile Tyr Thr 145 150 155 160 acc tgc caa cct agc tggtcc tct gtg aaa aat aag att gaa cgg ttt 768 Thr Cys Gln Pro Ser Trp SerSer Val Lys Asn Lys Ile Glu Arg Phe 165 170 175 ttc agg aag ata tat caattt cta ttt cct ttc cat gaa gat aat gaa 816 Phe Arg Lys Ile Tyr Gln PheLeu Phe Pro Phe His Glu Asp Asn Glu 180 185 190 aaa gat ctc ccc atc agtgaa aag ctc att gag gaa gat gca caa ttg 864 Lys Asp Leu Pro Ile Ser GluLys Leu Ile Glu Glu Asp Ala Gln Leu 195 200 205 acc caa atg gag gat gtgttc agc cag ttg act gtg gat gtg aat tct 912 Thr Gln Met Glu Asp Val PheSer Gln Leu Thr Val Asp Val Asn Ser 210 215 220 ctc ttt aac agg agt tttaac gtc ttc aga cag atg cag caa gag ttt 960 Leu Phe Asn Arg Ser Phe AsnVal Phe Arg Gln Met Gln Gln Glu Phe 225 230 235 240 gac cag act ttt caatca cat ttc ata tca gat aca gac cta act gag 
1008 Asp Gln Thr Phe Gln SerHis Phe Ile Ser Asp Thr Asp Leu Thr Glu 245 250 255 cct tac ttt ttt ccagct ttc tct aaa gag ccg atg aca aaa gca gat 1056 Pro Tyr Phe Phe Pro AlaPhe Ser Lys Glu Pro Met Thr Lys Ala Asp 260 265 270 ctt gag caa tgt tgggac att ccc aac ttc ttc cag ctg ttt tgt aat 1104 Leu Glu Gln Cys Trp AspIle Pro Asn Phe Phe Gln Leu Phe Cys Asn 275 280 285 ttc agt gtc tct atttat gaa agt gtc agt gaa aca att act aag atg 1152 Phe Ser Val Ser Ile TyrGlu Ser Val Ser Glu Thr Ile Thr Lys Met 290 295 300 ctg aag gca ata gaagat tta cca aaa caa gac aaa gct cct gac cac 1200 Leu Lys Ala Ile Glu AspLeu Pro Lys Gln Asp Lys Ala Pro Asp His 305 310 315 320 gga ggc ctg atttca aag atg tta cct ggg cag gac aga gga ctg tgt 1248 Gly Gly Leu Ile SerLys Met Leu Pro Gly Gln Asp Arg Gly Leu Cys 325 330 335 ggg gaa ctt gaccag aat ttg tca aga tgt ttc aaa ttt cat gaa aaa 1296 Gly Glu Leu Asp GlnAsn Leu Ser Arg Cys Phe Lys Phe His Glu Lys 340 345 350 tgc caa aaa tgtcag gct cac cta tct gaa gac tgt cct gat gta cct 1344 Cys Gln Lys Cys GlnAla His Leu Ser Glu Asp Cys Pro Asp Val Pro 355 360 365 gct ctg cac acagaa tta gac gag gcg atc agg ttg gtc aat gta tcc 1392 Ala Leu His Thr GluLeu Asp Glu Ala Ile Arg Leu Val Asn Val Ser 370 375 380 aat cag cag tatggc cag att ctc cag atg acc cgg aag cac ttg gag 1440 Asn Gln Gln Tyr GlyGln Ile Leu Gln Met Thr Arg Lys His Leu Glu 385 390 395 400 gac acc gcctat ctg gtg gag aag atg aga ggg caa ttt ggc tgg gtg 1488 Asp Thr Ala TyrLeu Val Glu Lys Met Arg Gly Gln Phe Gly Trp Val 405 410 415 tct gaa ctgcat gaa gga aat att tcc aaa caa gat gaa aca atg atg 1536 Ser Glu Leu HisGlu Gly Asn Ile Ser Lys Gln Asp Glu Thr Met Met 420 425 430 aca gac ttaagc att ctg cct tcc tct aat ttc aca ctc aag atc cct 1584 Thr Asp Leu SerIle Leu Pro Ser Ser Asn Phe Thr Leu Lys Ile Pro 435 440 445 ctt gaa gaaagt gct gag agt tct aac ttc att ggc tac gta gtg gca 1632 Leu Glu Glu SerAla Glu Ser Ser Asn Phe Ile Gly Tyr Val Val Ala 450 455 460 aaa gct ctacag cat ttt aag gaa cat ttt aaa acc tgg 
taagaagatc 1681 Lys Ala Leu GlnHis Phe Lys Glu His Phe Lys Thr Trp 465 470 475 taatgcatcc tatatccagtaagtagaatt atctcttcat ctgggacctg gaaatcctga 1741 aataaaaaag gataatgcaataaacacagt tgcaggaaag tatgttagct atatactatg 1801 aagtactctt agtttacttatgttgaatgg cttagctatt aatactcaaa ttgagttaaa 1861 atgaaaattc ctccttaaaaaatcaaacgt aatatgtatt acatttcatg gtacattagt 1921 agttctttgt atattgaataaatactaaat caccta 1957 4 477 PRT Homo sapiens 4 Met Arg Thr Trp Asp TyrSer Asn Ser Gly Asn Met Lys Pro Pro Leu 1 5 10 15 Leu Val Phe Ile ValCys Leu Leu Trp Leu Lys Asp Ser His Ser Ala 20 25 30 Pro Thr Trp Lys AspLys Ser Ala Ile Ser Glu Asn Leu Lys Ser Phe 35 40 45 Ser Glu Val Gly GluIle Asp Ala Asp Glu Glu Val Lys Lys Ala Leu 50 55 60 Thr Gly Ile Lys GlnMet Lys Ile Met Met Glu Arg Lys Glu Lys Ala 65 70 75 80 Asn Gln Ala ProGlu Thr Glu Ile Ile Phe Asn Ser Ile Gln Val Val 85 90 95 Pro Arg Ile GluHis Thr Asn Leu Met Ser Thr Leu Lys Lys Cys Arg 100 105 110 Glu Glu LysGln Glu Ala Leu Lys Leu Leu Asn Glu Val Gln Glu His 115 120 125 Leu GluGlu Glu Glu Arg Leu Cys Arg Glu Ser Leu Ala Asp Ser Trp 130 135 140 GlyGlu Cys Arg Ser Cys Leu Glu Asn Asn Cys Met Arg Ile Tyr Thr 145 150 155160 Thr Cys Gln Pro Ser Trp Ser Ser Val Lys Asn Lys Ile Glu Arg Phe 165170 175 Phe Arg Lys Ile Tyr Gln Phe Leu Phe Pro Phe His Glu Asp Asn Glu180 185 190 Lys Asp Leu Pro Ile Ser Glu Lys Leu Ile Glu Glu Asp Ala GlnLeu 195 200 205 Thr Gln Met Glu Asp Val Phe Ser Gln Leu Thr Val Asp ValAsn Ser 210 215 220 Leu Phe Asn Arg Ser Phe Asn Val Phe Arg Gln Met GlnGln Glu Phe 225 230 235 240 Asp Gln Thr Phe Gln Ser His Phe Ile Ser AspThr Asp Leu Thr Glu 245 250 255 Pro Tyr Phe Phe Pro Ala Phe Ser Lys GluPro Met Thr Lys Ala Asp 260 265 270 Leu Glu Gln Cys Trp Asp Ile Pro AsnPhe Phe Gln Leu Phe Cys Asn 275 280 285 Phe Ser Val Ser Ile Tyr Glu SerVal Ser Glu Thr Ile Thr Lys Met 290 295 300 Leu Lys Ala Ile Glu Asp LeuPro Lys Gln Asp Lys Ala Pro Asp His 305 310 315 320 Gly Gly Leu Ile SerLys Met Leu Pro Gly Gln Asp Arg Gly Leu Cys 325 330 335 
Gly Glu Leu AspGln Asn Leu Ser Arg Cys Phe Lys Phe His Glu Lys 340 345 350 Cys Gln LysCys Gln Ala His Leu Ser Glu Asp Cys Pro Asp Val Pro 355 360 365 Ala LeuHis Thr Glu Leu Asp Glu Ala Ile Arg Leu Val Asn Val Ser 370 375 380 AsnGln Gln Tyr Gly Gln Ile Leu Gln Met Thr Arg Lys His Leu Glu 385 390 395400 Asp Thr Ala Tyr Leu Val Glu Lys Met Arg Gly Gln Phe Gly Trp Val 405410 415 Ser Glu Leu His Glu Gly Asn Ile Ser Lys Gln Asp Glu Thr Met Met420 425 430 Thr Asp Leu Ser Ile Leu Pro Ser Ser Asn Phe Thr Leu Lys IlePro 435 440 445 Leu Glu Glu Ser Ala Glu Ser Ser Asn Phe Ile Gly Tyr ValVal Ala 450 455 460 Lys Ala Leu Gln His Phe Lys Glu His Phe Lys Thr Trp465 470 475 5 1485 DNA Homo sapiens 5 atgaaaatta aagcagagaa aaacgaaggtccttccagaa gctggtggca acttcactgg 60 ggagatattg caaataacag cgggaacatgaagccgccac tcttggtgtt tattgtgtgt 120 ctgctgtggt tgaaagacag tcactgcgcacccacttgga aggacaaaac tgctatcagt 180 gaaaacctga agagtttttc tgaggtgggggagatagatg cagatgaaga ggtgaagaag 240 gctttgactg gtattaagca aatgaaaatcatgatggaaa gaaaagagaa ggaacacacc 300 aatctaatga gcaccctgaa gaaatgcagagaagaaaagc aggaggccct gaaacttctg 360 aatgaagttc aagaacatct ggaggaagaagaaaggctat gccgggagtc tttggcagat 420 tcctggggtg aatgcaggtc ttgcctggaaaataactgca tgagaattta tacaacctgc 480 caacctagct ggtcctctgt gaaaaataagattgaacggt ttttcaggaa gatatatcaa 540 tttctatttc ctttccatga agataatgaaaaagatctcc ccatcagtga aaagctcatt 600 gaggaagatg cacaattgac ccaaatggaggatgtgttca gccagttgac tgtggatgtg 660 aattctctct ttaacaggag ttttaacgtcttcagacaga tgcagcaaga gtttgaccag 720 acttttcaat cacatttcat atcagatacagacctaactg agccttactt ttttccagct 780 ttctctaaag agccgatgac aaaagcagatcttgagcaat gttgggacat tcccaacttc 840 ttccagctgt tttgtaattt cagtgtctctatttatgaaa gtgtcagtga aacaattact 900 aagatgctga aggcaataga agatttaccaaaacaagaca aagctcctga ccacggaggc 960 ctgatttcaa agatgttacc tgggcaggacagaggactgt gtggggaact tgaccagaat 1020 ttgtcaagat gtttcaaatt tcatgaaaaatgccaaaaat gtcaggctca cctatctgaa 1080 gactgtcctg atgtacctgc tctgcacacagaattagacg aggcgatcag gttggtcaat 
1140 gtatccaatc agcagtatgg ccagattctccagatgaccc ggaagcactt ggaggacacc 1200 gcctatctgg tggagaagat gagagggcaatttggctggg tgtctgaact ggcaaaccag 1260 gccccagaaa cagagatcat ctttaattcaatacaggtag ttccaaggat tcatgaagga 1320 aatatttcca aacaagatga aacaatgatgacagacttaa gcattctgcc ttcctctaat 1380 ttcacactca agatccctct tgaagaaagtgctgagagtt ctaacttcat tggctacgta 1440 gtggcaaaag ctctacagca ttttaaggaacattttaaaa cctgg 1485 6 1431 DNA Homo sapiens 6 atgaggacct gggactacagtaacagcggg aacatgaagc cgccactctt ggtgtttatt 60 gtgtgtctgc tgtggttgaaagacagtcac tccgcaccca cttggaagga caaaagtgct 120 atcagtgaaa acctgaagagtttttctgag gtgggggaga tagatgcaga tgaagaggtg 180 aagaaggctt tgactggtattaagcaaatg aaaatcatga tggaaagaaa agagaaggca 240 aaccaggccc cagaaacagagatcatcttt aattcaatac aggtagttcc aaggattgaa 300 cacaccaatc taatgagcaccctgaagaaa tgcagagaag aaaagcagga ggccctgaaa 360 cttctgaatg aagttcaagaacatctggag gaagaagaaa ggctatgccg ggagtctttg 420 gcagattcct ggggtgaatgcaggtcttgc ctggaaaata actgcatgag aatttataca 480 acctgccaac ctagctggtcctctgtgaaa aataagattg aacggttttt caggaagata 540 tatcaatttc tatttcctttccatgaagat aatgaaaaag atctccccat cagtgaaaag 600 ctcattgagg aagatgcacaattgacccaa atggaggatg tgttcagcca gttgactgtg 660 gatgtgaatt ctctctttaacaggagtttt aacgtcttca gacagatgca gcaagagttt 720 gaccagactt ttcaatcacatttcatatca gatacagacc taactgagcc ttactttttt 780 ccagctttct ctaaagagccgatgacaaaa gcagatcttg agcaatgttg ggacattccc 840 aacttcttcc agctgttttgtaatttcagt gtctctattt atgaaagtgt cagtgaaaca 900 attactaaga tgctgaaggcaatagaagat ttaccaaaac aagacaaagc tcctgaccac 960 ggaggcctga tttcaaagatgttacctggg caggacagag gactgtgtgg ggaacttgac 1020 cagaatttgt caagatgtttcaaatttcat gaaaaatgcc aaaaatgtca ggctcaccta 1080 tctgaagact gtcctgatgtacctgctctg cacacagaat tagacgaggc gatcaggttg 1140 gtcaatgtat ccaatcagcagtatggccag attctccaga tgacccggaa gcacttggag 1200 gacaccgcct atctggtggagaagatgaga gggcaatttg gctgggtgtc tgaactgcat 1260 gaaggaaata tttccaaacaagatgaaaca atgatgacag acttaagcat tctgccttcc 1320 tctaatttca cactcaagatccctcttgaa 
gaaagtgctg agagttctaa cttcattggc 1380 tacgtagtgg caaaagctctacagcatttt aaggaacatt ttaaaacctg g 1431 7 72604 DNA Homo sapiensmisc_feature (1)...(72604) n = A,T,C or G 7 acattttaag ctacttatagtccttggaaa tagcaacaaa tatcttagtt attggactat 60 tataacctta gtcatcttattactgcttga ttatgagaca ctctccctgc taatccttag 120 aacatcttgg ttcttggtacttgactttta gcccctctga catatagttg atgtcagagt 180 gtctggcatt tcagtagtgctctattttac aaatcccagt aaactgctcc actgtggctt 240 gtttatgtgt taatactgcttgttttctgt tataaattat tttttgcttt ggagtaagat 300 atcatcattt tgcatagctacaaatctgaa gttaaagaaa attttaaaaa tgtaattgtg 360 ggaaaataac aaatagatctgctgagatgg aggctttgac taatgtttta ataacaggca 420 acaaaacaaa gaggcaggatattttggtca caactaaacc taaattaaat cctcatacaa 480 agccccatta agataaatgctcaaattctg ggaacatttc acttgctttg ccagcaattt 540 tacccttcag agggtgtggatctaatcagg ggaacaaact accctgggct taattctcat 600 taacagggac taatttgtcaaagcggcagt actagctgaa gtgatgggta tggaagcatt 660 cactgtgagg attttgctgaggtgcctggc acagggtagg ggaactcacc caggctgcaa 720 gatgctaaca gttcaggttcaaggtcttag tgtggactaa ggtgcagtca ggatgggaac 780 aggtgcaact tgggccaacatcagtatgaa gggcctgatc tgagggcagg ggaaggaggg 840 ggcattctgg gaagcaagagttcctggtat cctgttgacc agagtcttgg cccaaggatc 900 aacgtatgaa ttaaagtagaaataccagaa acaaagaaag ttggcagaaa ctaggagaag 960 cagagtctca gccaactggactgggctcag ccttggctac tggcccggca gatgatagaa 1020 gagaaaacca ggaacccaggctgaagccca gtggttgggc tggccacaca ccatgcatag 1080 ccttaaaggg gtggcctaagggcatggtcc gctccaaaaa aggaaagggg gccccagaat 1140 atttctgaat cccactcactgccagggaag aacctctcaa ttcactcaat agtgcattct 1200 cctgcttctc aataggctaatactctagag aatatgggga caaggggagg agggtctagt 1260 ggaacaggtc taaactggcgtttgaatttt aagataagtt aatcatacat tggctgggtc 1320 agccatgtct cttagtctttacaaaagtag aacacaaaaa aattcaatgg aaatctacag 1380 acacctattt gcagatgaggaaacacggct atgaagattg ggaagattgg gaagaactgg 1440 ccaggtgtgg tgctcacgcctgtaatccca gcactttggg aggccgaggc tggtggatca 1500 cttgaggtca ggagttggagaccagcctgg gcaacatagt aaaaccctgt ctctactcaa 1560 attacaaaaa 
tcagcagggcgttgtggtgc ccacctgtaa tcccagctat gcaggaggct 1620 gaggcaggac aatcacttgaacctggtagg cggaggttgc agtgagccaa aatcacgcca 1680 ctgtactcca gcctgggtgacagagcaaga ctttgtttaa aaaaaaaaaa aaaaagggaa 1740 gaactaaaaa tgtaattttcaaggggctat cacaaatggt cccaataaag agaaagcagg 1800 actcatgttt aagaaacccatgagatgtgt atggacctca tggaagagct cttgctttct 1860 aatgatctac gtaacagatgaaaagcagag catagggcta aggatgaaaa tacaacagta 1920 ataaggtatt aatatattattaagaaagct aatgctccac ataagcagag gacattaaag 1980 ggactttttt ttcttaaggatatcttaatg ttttaaatga gaagacatag aaagggatag 2040 gtccaactct tgggattgttgcaggttggt ttccatcgga agcactctga gtctgagatt 2100 tgtatgcaga aaattaatttgaatgtgctt ttcagatcac ccaggtgggg gagggaggaa 2160 accaggactg ggcagagagaggctgggctg taaccaagtc acaacaaagg tgtcagctgg 2220 tcccatggtg aattctggacctaggatggc tgatcccaag gcattccaaa ctggggcaag 2280 gaagttgtgc tttaaaacttctcattgact gtcagtcact gggcatgagc agtccccagg 2340 aaggggggat gaccttgagcaaggtggatg tcttcagcca agggcaayca ctgggaagga 2400 gaacccagct atgaactgtcagctgccaac actcccagca tctgagagga tgagggcttc 2460 aattctaagg gcaggggctccaagggcagg ggtacggatg gtggaatctg ggcagtacct 2520 tgtggcttcc actacagtccaccccttgca ccacttagtt ccactggctt tttttttttt 2580 tttcttttct gagacagtctcactctgtca cccaggctgg agtgcggtgg cacgatctcg 2640 gctcgctgca acctccgcctcccaggttca agcaattctt gaacctcctg agtagctggg 2700 actacagatg tgtgccaccacacccagcta attttttgta tttttagtag agacggggtt 2760 ttaccgtgtt agccagattggtctcgatct cctgacctca tgatccgcct gctttggcct 2820 cccaaagtgc tgggattacaggtgtgagcc accgcacaca gccagatcca ctggcttcta 2880 tataatttct gggtgaagctaattcaggat tctgatggac ctgtcttccc gagggaaact 2940 tgtaaaagga aagttagagggacaaactat agcccctgcc acagcagctg ctgtcgagga 3000 caaaaatggt gctcctcatttcccttaacc acctgaccta gattccccta acccttagtg 3060 ggcacctctg tggatggaagtggtggctca cykgkkggrw krwycmrrwy ycwymyccct 3120 gagtggtctg agctcccagttaccaggccc ttctcaggct gtggctgttg cacttacctc 3180 cccagccatc ccccacttttttttcttgag actgggtctt gctctgtcac ccaggctgaa 3240 atgcagtggc ataacctcagctcactgcag ccttgatctc 
ccaagctcaa gccatcttct 3300 cacctctgcc tcccaagtggctgggactac aggcacatgc caccatgccc agctaatatt 3360 ttttattttt tatttttttgtagcaatggg attttgccat gtttcccagg ctgggcttga 3420 actcctaagc tcaagctatcctcccacctc tgcttcccaa agtgctggga ttacaggctt 3480 gagtcactgc atctggccacatttattcct tttaaacgtt aaaattgaat gcaggatcac 3540 tgagagacag gtgagtgattaccagggtgc caaacatacc cttctcctcc tttcctgcag 3600 ctctacctcc tcctgatgatcaggacaatc atgtatgatg actcctttcc ttgactgctg 3660 ctctctcaga aggaacccattgtgttgggt gagaacacat catttgaaat ttagtaagac 3720 tcttgctgtg cctatggtagaagcattccc tctctggggc caagatcttt aaatgcacag 3780 agtccaaagt cgtgggaaccaaagcagaaa ttaaaaagga gatgactggg attatggtaa 3840 gaactgtttc cacccttgatttgctgcacc catgtgttct acctaggaga tagcacacca 3900 tatactggtt attcatttggattacatgct gcatcccgga gaatgggcac tgcattctca 3960 ctggtcatca tgtcagagcctgcgctgcag aggctttccc attgctctgt cagtgtgtta 4020 tagggtcagt ggatttcatggtcatgtgcc cactgctgca cctccattct tgtaaaatgg 4080 gtcctctggt tcaatgtgatgccatgtggg atcttgtgtc aatagaataa atactcagat 4140 gttctggctg aagctttacaagcagaaaag gccaaccgat gactgaaata agcgttgagc 4200 ccagtcaaga tgagttcctgctctttccag gatagacgga gtctagtgta gatcacttga 4260 catcaagaga ctggctggtctccttgaggg atggtgctgt tctgcattca tcatccttga 4320 tgaatgaggg accctgctattgggctcatg tacagccccc atctctgcca caatgagcgc 4380 tccattcatg ttcctattgtgccaacacta gggtgtctgt aatcactgaa aacattattg 4440 ctatcattat tattattttttttttttgag acagagtctc gctctgtcgc caaggctgga 4500 gtgcagtggc acgatctcagctcactgcaa cctctgcctc ccggcttcaa gtgattctcc 4560 cgcctcagcc tccagagtagctgggattat aggcatgcgc caccacgcct ggctaatttt 4620 tgtattttta gtagagacagtcttttgcca tattagtctg tctggtctcg aactcctgac 4680 ctcaggtgat ctgcccgccttggccttccg gagtgctagg attataggcg tgagccacca 4740 cttgctatta ttatgttgagaaaactgttt tcaattataa ataagaaaaa ataaaagatt 4800 atattttgcc tttattccttctctaatgct gttctttaag tagatgtgaa tttctgaact 4860 acatactttt tctttactcttgagaggttg tttggaggtt ccagcagggg accacagcta 4920 ctcgtatacc cttgaccaaagactggtcct tgtctatcaa ggatggtcgt cttcttccac 4980 caagcacaca 
gcttctggagggacgcacat ggagtggtga gggaggaagg ggacacccgc 5040 ctagccagct agatcagccaagcagaataa accctggtag tcaatggggt gacagtgtcg 5100 cagccagatt gccctcacatccaactctta gtgatcttct cttaacattt cttgcaaggc 5160 aggtctactg gtacaaattctctaattttt gcttgtttga gaaagtcttt gtttcttctt 5220 cacctttttt tttttttttttggagacaga gtctccctct gttgtccagg ctggagtgca 5280 gtggcctgat cttggctcactgcaaactct gcctcccagg ttcaagtgat cctcattcct 5340 cagccatctg agtagctgtggttacaggcg tgtgccacca tgcctagcta aattttgtat 5400 ttttagtaga gacgaggttttaccgtgttg gccaggatgg tcttcagcct tcttaacttt 5460 taaaggataa tttcacggggagaattctag gttagtgtat ttytctttca atactttaaa 5520 tatttcactc cactttcttcttgcttgtgt ggttctgaag ataatgatat aattcttatt 5580 cttgtttctc tgcaggtaaggtggtttcat acctctggct tctttcgaga atttctcttt 5640 gtctttgatt tcctacagtttgaatatgat ataattatgt atagacttgg ggctatttat 5700 cctttctggt gtagtctgagctccctaagt ctgtggtatg gtgtcttgta attgatttgg 5760 gaaaattctc agtcattattacttcaaata tttcttctgt tcctttgtgt ttttttaact 5820 tgtgccaact ttttaattgatacatagtat tttacatatt tatggggtac atgtgatact 5880 tcattacctg catagaatgtgtaaatgatc tagtgaaggt gtttggacta ttaccttgag 5940 tatgtatcgt ttctatgtgttgggagcttt tcaagtcctc tcttgtaaca attttgaaat 6000 atacaatgcc ttgttgttaactagtcaccc tgctctgctc tcaaacacta ggatttattc 6060 cttctgtcta actgggtgtttgtacccatt aaccaacctg tcttcatccc ctctacccac 6120 atacctttcc cagccttgggtatctatcat tctactcttt acctccatga gatcagcctt 6180 tttaactccc acatatgagtgagaacatgt agtacttgtt ttgccgtgtc tggcttattt 6240 cacttaagat aatgaccttttattccatcc aggtcactgc aaataacaag atttcattgc 6300 tttttctttt tatggccaaatagtgttcca ttgtttatat agaccacatt ttactttatc 6360 catttgtaca ttgatgaacactgaggttga tccatatctt ggctattgtg aatagtgctg 6420 caataaacat gggggtgcaggtatcccttt aatataccga tttcttttcc tttggataaa 6480 tacccagtaa tgggattgctggatcatgtg gtagatgtat tttaagtttt ttgagaaacc 6540 tccatactct tccatcatggctgtattaat ttacattccc atcaatagta tatgagttcc 6600 cttttttttc tgcatcctcaccagcatcta ttatttttgt ctttataata atggcctttc 6660 taaccagggt aagatgatatctcattgtgg ttttgatttg 
catctccctg atgagtagtg 6720 atgtcaagcg tttttccatatgcccattgg ccatttgtat gtcttctttt gatgaagtct 6780 gtttgtgtcc tttgcccactgtttatgctc ctttttttct tctctctctg gtatccccct 6840 cacacatata tcagaccttttttaattgtc ccacaattct tgcattttct gttctttttc 6900 attctttctt ctctttgtatttcagttttg gaagtttcta ttgatattca agctcactga 6960 ttcttcctct ggctctgttcagtctattaa taagcccttc aaagcctttc tctctctttc 7020 tttctttctc tctctctctttctctctttc gttctttctt tctctatttc cttcctttct 7080 ttctttctct ttctttctttctttctcttt ctttctttct ttttctttcc ttccttcctt 7140 ccttccttcc ttccttccttcctttctttc tttctccttc cttccttcct tccttccttt 7200 ctttctttct ccttccttccttccttcctt ccttccttcc ttccttcctt ccttccttcc 7260 tttctttctt tctttctttctttctttctt tctttctttc tttctttctt tccttctttc 7320 tttcgaccag ttctcactatgttgctcagg ctagcctaga acccctgggc tcaagttatc 7380 ctctcagctc agcctttcaagtaggtggga caaatgcgcc attctatcat acccaacaat 7440 tcctcatttc tgttacagtggtttttattt ctagcatttt cttttgattc tttcctagag 7500 tttccatctc tctgcttacatacacatttg ttctctcata ttttccactt tttccattag 7560 ggccttcagc atattaattagttattttca attctagcct gataattcca aaatctcggt 7620 tatatttgag tctgtatctatgcttggttt gtctcctcag actgcgtttt ttccttttag 7680 gatgtccctt atcattttttgttgaaaaca agacatgatg tatcagataa aagtaattga 7740 ggtaaacagg cctttaatatgaggttttat gtttatctgg cttggagtta ggctgtgttt 7800 actctttgct gtaactttggtgccagaggc taaaatttcc tctggtgccc ttgtttttgt 7860 ctctcctgtt atgtttgtgtttccacagag tctccgtgaa tatggtgtga ggcttgaagt 7920 tctttagctg taacccctcttattatacag gagccttacg gatgtggtgg taatgtggga 7980 gggtgggctt aagtattcagcagtcctgtg atcaggcctc agtcttttaa taagcctgag 8040 tacttccctt tccctttctgcatgttagag tggcctggag ttgggggtat ccattacccc 8100 aggttggtag gctttggtaaaaccacagtc tatcaagctg tggtaaaata gtttccctgc 8160 agtctggctt tgttaaggataacagagggc tctgggggtg tttcaaaatt gctacttttc 8220 ctctctccct gtcagaagcacaaggagatt tctcttgatc ttcaccctga gagtctggtg 8280 gggttcctgg aggtaaaactcaggaaagtg tgagggcctc cacacaaagg gtctgctgaa 8340 gtttgttcca tagcctcagttctctaatgg atctaagaag agttattgat tttcaatttg 8400 tccaacttaa 
ttcttgttttgaagacagaa gtgatgactt ccaagctctt tatatgttga 8460 acccaacccc atattattttcaattagcaa ttgcatatag caatggtaca ttgcatttat 8520 agaaatataa ttgatgtttgcctgtgtatc ttttttccta ttatgttgct gaattcattt 8580 cttagttcta ggaatttttcaaatacatcc cttaggatat tctgtataca taatcatgtc 8640 atctgcacat agggacagttttatttcttt ttctagtctg tatttcttat ttccttttct 8700 tgccttattg cagtggctagaacttgcagc actatattaa aataagagtg gtaaaagtga 8760 acattctttc tttgttgctgatcttggggg gaaagtattc agtctttcac cattgagcat 8820 aatgttagct gtaggtgttttaaatcttta tccagttgac gaagttaccc tttattccaa 8880 tttttctgag agtttatatcataaatgtgt taaattttgt caaatttttt tgcatgtatt 8940 gatatgatta tgtggtttttcttctttagt tactgcagtg ggttgcattg attgatttct 9000 attattgaac cagcctgcattcctggaata aaccccattt ggtcatgatg tataattctt 9060 ttttttatat tgctgaattctatttgctaa tattttgtta aggatttttg catctgtgtt 9120 catgagggat ctgggctggtaggttttttt cccccctgca atgtctctgt ctggttttgg 9180 tattaaggta attttttttkttttkttttt gagatggagt ctcgctctgc tcacccaggc 9240 tggagtgcag tggcacgatcttggctcact gcaacctcca cctcccaggt ttaagcgatt 9300 ctcctgcctc aggctcctgagtagctggga ctacaggtca caccaccacg cccgactaat 9360 ttggtattaa ggtaatattatcatcataaa atgaactggg aagtgtgccc tcttcttgta 9420 tttctttttt tttttttgagacagtcttgc tgttgcccag gctggagtac agtggtacga 9480 tcatggctca ctgcagcctcaaactcccag gctcaagtga tcttcctgcc tcagccttcc 9540 cagtacaggg gcaggctaccacatctggcc aatttttaaa tttttctttt gtagagaggg 9600 gtctcactat gttgcccagaggatctcaag caattcacct accttggccc ctcttcttgt 9660 attttatgga agaattattggtgtcaattc ttcttgaaag tttcgttaga attcttcagt 9720 gaagctgtat gggcttgaagattacttttt tttctttttt ttttgagatg gaatttcact 9780 cttgtcgccc aggctgtagtgcagtggtgt gacctctgct cactacaacc tctgcctccc 9840 acgttcaggt gattcccctgccttactcag cctctggagg agctgggatt acaggcaccc 9900 gccaccatgc ccggctaattttttgtattt ttagtagaga cggggtttca ccatgttgac 9960 cagactggtc tcgaactcctgacctcaagt gatccacccg cctcggcctc tcaaagtgct 10020 gggattacag gcatgagccaccgcgcccag ctgaagattt ctttttgggg agttttaaat 10080 tatacaatca atttgcttaataggtataag ctattcaagt 
tatctatttt atactggatg 10140 agttgcaata gtttgtggtttatgagttta tatggtccat ttcatctgag gtataaaatt 10200 tayttgtgta gtattgttggtagtattccc ttgttatctt ttttatgttc acatggtata 10260 tggtgacagt cctggtttaattcctagtat tagtaactgg ctctctctct ctctctctct 10320 ctctctctct ctctggtcagtctttccaga ggtttgtcaa ttttgttgac ttttttcccc 10380 caaagaatca gctctttgtttcatggattt tctgcttttc tgttttcaac ttcattgatt 10440 tctgctgttt attatttctctccttctgtt ggttgtgagt ttgttttgct tttctttttc 10500 tacatattcg atgtgaaatcttacattatt cactcgggac ttttcttctt ttttgatgta 10560 tgcatttagt attctaaatttacttctkag tactgcatac tgcttgaact atgtctgaca 10620 aatattaata tattgtttttaaatctttat tcagttcagt gtatttttaa aatttccttc 10680 tctgcctctt ctttgatttgttatttagaa ttgtgttgtt attttccgag tatttacatt 10740 ttcctcttat ctttctgcattgattccatc gtagtcagag tgcatgctct gtacagtttc 10800 agttctttca aatttattgagctttgttta atggatctgg atacagttta tcttggcata 10860 tatatatata tacacacacatatgtatgtg ggcgcttgaa aagaaagcgt atctgctgtt 10920 tggtggaatg tttggagtgttctataagcg gtgattagat actgttggtt gatgatgtca 10980 ttgagggtcc gataaccctactgatttaaa tttatttagt ctgtcaatta ttcagagaga 11040 gaggtgttga actctgcaatgtgaattgtg gatttgtcaa tttctccttt cagttctatt 11100 agttttttct tcacatattttacaactctg ttgtttggtg catacacatt tatgcaccaa 11160 atttaggatt gctataacttcttggtggat tgaccctttt acattatata atgtcttttt 11220 ctgtccctgg taattgtggttgctctgaag tctatgttat ctcaatataa atagacaact 11280 ctgctttctt ttgattaatgtttacatgat acatcttttt ctattctttt actttcaact 11340 tacttatatt attatgtttgaagtgagctt cttgtagaca gcatgtagta ggtcatatat 11400 gtacatagat atatatatttttttgagatg gtgtactctg tcacccaggc tggagtacag 11460 tagtgctcac tgcaacctctgcctcctggg tcaagtgatc tcgtgcckca gcckccccag 11520 tagctgggat tacaggcacgcaccaccatg cccagctaat ttttgtattt ttagtagaga 11580 cgggtttaac catgatggacaggctggtct cgaactcccg acctccagcg attagcccac 11640 cttggcctcc caaagtgctggcattacagg tgtgagccac cgtgcctggt ttaatatttt 11700 taatccactc agtctttgtcttctactggt gtacatagac attcgcatgt aatgtaaatg 11760 ttgatatgta agagcttgaatctgttatgt ttttgctttc tctatgtttt 
ctcaattttt 11820 aatttctctg ttttctttttttctgcttca tattggctaa tgaacacttt gaatcattcc 11880 attttgattt acctatagtgttttttagtg tgtctctttg catagctttt ttaggggtta 11940 ctttaagtat ttcattatatgtacataact tatcacagta tattggtatc gttattttac 12000 cagttcaagt aaagtatggaaatgtttcct ctctacattc ctttacctca tttataatat 12060 aattgtctta ggtatttcttgtacatacat tttaaaccgg atgagtgtta tttttgattt 12120 agctatcaaa taattccaaaaactcaagaa aaaaaggaaa gcttactata ttgacccata 12180 ttttcattca ccatgttgtttcttccctct ttatgcccca tagttccttc ttctattgtt 12240 ttcgtttaga gaacttcctagccattctat tggggtagat ctcctagtga caaattctct 12300 tagctttctt ttctctgtgaatgtctttat ttccctcttt gttcctggag gacattctca 12360 ctggatatag gattcttggctattgggtct tttcttttgg cacttttgta agtgtgcagc 12420 ctgctgtcaa aataaaaattaaaataaaat aaaaatgaat gttttccttt gctacgttca 12480 tgaaagtata attcactgaatgaggaggga cacccatctc tataatctgg aggcccatgc 12540 tcacctctga atagtacatttgcagagaaa ttggggaaat caaagtctgt tgagaccagc 12600 aagataaata aggcaaaaggatacaaaacc atatccaaag agaaatggtt taaaggaact 12660 aaggctgttt ctcctaaaaagaaaatagtt ggagacatgt gacctccaaa gaaacaggac 12720 tttttctatg gggctccaaggggtttctat gagagaatga taaaggagag atttcagctt 12780 agtctcagga agacttttcaacaaccaaac ctgcccaaag atggactgcc ctgcctaagg 12840 attgtgttct gacattaagggtatggaggt atgggttaga tgaatatttt accaaaatgc 12900 catagatatt tcaggctattgatgttgtaa tatcatacta ggcaactcca cttcaatatg 12960 agtctctatg atgtaaaatgaaataggatg tgtttcgata gagagttgca gatttcattt 13020 tgatgttagc gaccacacaaaattactttc cctacataag aacatgttat tactctagtt 13080 gatgatgact gcttatgggaaatgtgtctg ctttgttagg aatcttgcct aatatatgta 13140 taattcaaga tggtattataaagtgacata tatgatttta acatttgcac ttaaaataac 13200 acttattctg taccatgmastgtctaggag cttctacata ttccattatt atctttattt 13260 tacaagacag ggaactaaggcatggagaga ttgagtaatt tgtgcaatat tacctaccta 13320 gtaagtggta aaggaaagattggaacccat tctggctcca ggatccaggc tcaaagccaa 13380 tatactatcc accaccccaactctttagtt tgatcaattt gtcaaattat tttacagtta 13440 tttatctgta aattaaggggataattgccc agtcaataaa tgtgtcccct tcaaaggtta 
13500 catacttaac caatggtgctactgggctca gaacattttt ggaactacga ttttggtggc 13560 aaccaaaaaa cctccagtacattcctctga acattctcca gaggcaagtc tttctccatg 13620 gagactgggc ttcattttttgaattagcct gaagttgttt gaggtcaaat ctgatgaaaa 13680 gagcggctgg ggaagctggatattttcgtt cgtgatttaa aacagtaaat gccacctaaa 13740 tgagaaggct actttctttgaatgttttgt aaactggctt tgaaggtact tctttaaaaa 13800 agaagcacaa gaaagacggtgactggcaac agcctcactg gaatacgtct ctaatcatca 13860 aggcaaccca cactcatttggatgtgtgca tccggtgatg ttattatttt taaagttatg 13920 tgccacaaag atgcattctttgctatacaa aagagctgtt gttaaattta taaagatata 13980 aaaaggggaa aggagaaggcaccaaatgga agattcttag gcattaagtg ctcagacagc 14040 atagatcttc attagatgacgtcagggaga agagacacag actttgccat ctcaggtaga 14100 agtatcaaag tcatcagcctcctagtaaga cagacctggg tttgaagctc tgcacagcca 14160 tttcctagct ggtctggggaaaaattactt cttgaagcct cagtgtcttt atttgtaaag 14220 taagtggaat tatattaccttgtcaggatg ttgtcagaat tagaaataat ttaaagaggt 14280 ccagcacgag caggtcaatcaagggaagat gttaaaaata acaacaggtg aaatgtactc 14340 ccaaaagata aagtggatacatagatgaat cttcctcaca cacagagtat aataacctca 14400 gaaaaatatt gcctagagtaaacatgcctc ccaagccaac gttcatcatc caggaatacg 14460 gagaggatgt ttgggatatggggggcatga aattttacaa ttgtagggcc ctttaacaag 14520 ggtagacttg caagttgcactgmctttcct gccctcctct ggctacctgt tccagcatcc 14580 agagtttgtg aacctggggmccaaggacag caccctggca tgggcaggcc cactnggcga 14640 ctctctcagg gctgctgcagctgtgtcagt gtccccacag ggagnctgac atccagccat 14700 gaccatcgca ttaagcccagcagtcagggc aggggagcaa ctgctcagag gcacctttga 14760 cccactactt ttttcccctcctgctttatc tgcccagagc gaggctctct ttctaatgtg 14820 tacaaggcgt tctacctatgactcgtggtc ctgccataga aatgcttttt ttttttttaa 14880 ctgaattaag ttgccaagtttgaaaaatca gaatttcaca taagatccct atttctgtct 14940 tcttttgaaa aactgaatgttctttccaca gtgagcccac attccttcct gacgaccatc 15000 accgttcagc tggagtagagagggctctgc tggcttcaga tccggacgcg caggtcctct 15060 gcaggccccg cccacccggcgtcacctgca ggtcccgccc acccggcgtc tgcaggcccc 15120 gcccacccgg cgtcacctgcaggccccgcc cacccggcgt ctgcaggccc cgcccacccg 15180 
gcgtcacctg caggccccgcccacccggcg tctgcaggcc ccgcccaccc ggcgtcacct 15240 gcaggccccg cccacccggcgtctgcaggc cccgcccacc ctgcgtcacc tgcaggcccg 15300 ggccgcgggg ttggtttccaccmtggaggt tgctgacacc ctgtgccctc ggctgacttc 15360 cagccggtgg cacagacgcctccagggggc agcactcaag cgcatcttag gaatgacagg 15420 tgagarcatc ctccgggccccagatttctc tcctcgccgc tcttgcccat ttctccggag 15480 agccagagaa agccgctcccaagtccaagg ccgagctccg cagacgcccg gcccctccgg 15540 cgcggacaga acaaagccattgttcttgcc ggggaaggta gaaatactgt gggctgcttc 15600 agaggctgcc gagcaaaactcaggcaatct cctgggctgt tccaatacgt ttattctctt 15660 tttcaaaaca ggaggaggaggtagaggcgg ggagacacac catccctgca aaactactgg 15720 caaaaactaa gcggagccgggtgtggtggc tcacgcctgt aatctcaaca ctttgggagg 15780 ccgagggggg ccgatcacttgaggtcagga gttggaggcc agcctggccg gcatggtgaa 15840 acacaaaaat tagtcgattgtggtggtgca tgcttgtaat cccatctact tgggaggctg 15900 aggcaggaga atcgcttgaacccgggaggc ggaggttgca gtgagccgag attgcgccac 15960 tgcactccag cctggacaacacaagtgaga ttctgtctca aaataaataa ataaataaac 16020 ccaagcagaa aaagaatcactctgaaaacg atcacatcta actatcaatg ctcatacagt 16080 ttatggaatt atcagcccaacttgataaaa tcagtatttg aggaaactgt ggataagccc 16140 cctgatttca atccccattgtgccaggtcc tggttaactg aggttaacga agtaaagagc 16200 tgcagacact attaactgctaccttaaacc gattactcta gcttagccta ctttccacgt 16260 acagatttta ccagtggacaacatgatgct ttatcttgtt tttctctccc tgggactttt 16320 ctccagacat tgaaaacagaaatactaata aggccacttt tacctgcctg atgcaagaac 16380 agaattttca aactcaacattaatgcaact cctcagtccc tgacaatggc gggtggaaaa 16440 gtttctaaaa atatgcagcagcacaattat cgggaagaga tgagatactg ttacctaata 16500 aaaatgccat aaatagagaatgatgaacta ccatgggaaa tgaatgcata gaagaggaca 16560 tgctggaatg tgggacagtaaaaatcactt aaactttgcg tgaccttgaa gaaagtcacg 16620 atgatctgtt tttccaggtccctcaaacag tgagatgtgg ctgtttccca agtcttcctc 16680 tccagtgtaa agggtctgaatttagacgct ttgtgagtct tccttctttc gacagcctgg 16740 agtctctctt gagtctcaaggctgcctgag ttcctctcta acatcctcta ggcagtatca 16800 gctaatgaga caatgaattccatggaggca gcagtgggaa cagaagtacc tctcttggat 16860 aatttacaac 
actggtgagcagagggtcag atcaccctgg ggtttgtgtc acaaccaaaa 16920 aagtggctgt ggcactgagttcttggatgg ttttctacag ctggtccaga ttttccatgg 16980 gctcaccttt aaattaaaagaatttctgca ctttgaagaa tttgaaaaca aagccatgtg 17040 tgagaatatg agatccactcatatgccctt gcaagaaata ggttgcattc ctttttccgg 17100 acttaaaaaa aaagcaccccctctttcttt ttttcagaag gcatatatgt aaatgattcc 17160 aaattaatct ttagcatgtgcctatgttgt tctgatttac taaactttaa aaatatgtcc 17220 attgttgtct gttaacagcttttggcaact ttttcagaga ttgaaatatg tgagcaaatt 17280 agagaaatga gtacaattattagctagtac cattcaacaa gcgctaaaga tacaaatacc 17340 tctacaatac ataaaaggaatgattatagt agattttata atgccatata aggtttctta 17400 tttaacttca ttcttaattctcaaaataaa atgaaattac atagaagcaa agtaatatag 17460 ttaccagaat agtatttttacatgtcttta agtgtatgtt gttgttgttg tttttaaggt 17520 aattatgtga tgttgtggaaagaacagaga cctgggttag ataaaattcc ggttgtctac 17580 cagattgtga tagtgagcaaattacttaac ctctatgatc cttatcttat ttatctatga 17640 aacaggattg gtaatactcatatcataagg ttgaaaggat taaatgaggc actatggaaa 17700 atttctaaca tggtggtgcctgggacagta gaagatgctt aataaagata gctttcatta 17760 ttattattag ctttttcaggtgatggtgat tgtaaatgtt taggtaattt tttaaacttt 17820 agaaataatt gattttcaaatgattaagac tgcttatttt aatcatttat ttttatcacc 17880 agatttattt ttattacccaaaatgtcaac gactgtcata aagataaaaa ttaataataa 17940 ttggccaggt gcggtggttcacgcctgtaa tcccagcact ttgggagctg aggtgggtag 18000 atcacaaggt caggagattgagaccatcct ggctaacgcg gtgaaacccc atctctacta 18060 aaaatacaaa aaattagctgggtgtgttgg cgggcgcctg tagtcccagc tactcatagt 18120 cccagctact caggaggctgaggcaggaga atggtgtgaa cccgggaggc ggagcttgca 18180 gtgagccgag atcgcgccactgcactccag tctgggctac agagccagac ttcatctcaa 18240 aaaaaaaaaa aaattaataataatttaaac ccgaagtatg aactgaatta tttcccttag 18300 tagcacatca cataggctgatgatagtttt ggtgactggt ttatctattc ttcctaaaag 18360 caaactgttg ttagatggatgatcacttgc atgttgtgac tgaactcagc agttgggttt 18420 tattttttat tttttatttgcttcagtagc attadccttt cctaccaaga ttcgaacaat 18480 ccatttgcct ttttttccctaaaatctctc atacattgta aatactacat attggctaaa 18540 tatttcctgg 
acagacatgaaggacacata aatcagtctc tgtatgatgt ttctcactgt 18600 aatggagttt atctggctcaagaccaggac atttattgca tatcaggttt ctacagttca 18660 ggcaaaagtt tgaggataaggacttactgc aaaaagtctt ctattgttct caaccatttt 18720 ctcgcttagc acatgcagagatttgaaatg gtccgtggta cagtagttgt gtctgtatat 18780 ttctcttgta gaatattagaacaagggatt tgcagtttac agagaagaag gcttggcgag 18840 gtgtttggaa atacactcagaaacctgagg aaatttgtgg aaagagaggc ttattatttc 18900 tagaaatatg ctagagtwcgttttgattgt gcacctgagg aattaataga ttaagtagtt 18960 ttataaggac tggggttaatagaatactgg cagtgaagtt tgtcttagga cttcttaatt 19020 ggataatcag tgaagtcaccagatcccagt tagagacagt tccaagtttt acaaaacgca 19080 agataactgt ccaagagctgtaatggctta atcatctttg aataatacct ctcactgaag 19140 ctatatcata agaaataaaaatctacattt taaaaaattg gctgtaatca tagggtgact 19200 aactgtccct gtttacccaggactcagggt ttcccaggct gagggacaat gggtactaaa 19260 accaggacag tcccaggcaaactgggacgg ttgatcaccc tacccaatgg cctcatctgt 19320 ctcattaaaa tatctggattacttcgtgcc tcaaaaatat cctcggctta cctgactcta 19380 gacagtcaag aagcttttattaattgtcta atgtatgcca ctttctggag gtgatattgt 19440 tcaactgata gatgagcatcactgattgaa atattttgtg gttttcatgc tttgtatctt 19500 gtgctgatag ccccacatggatatttctgt ttccaagttt gtgtcacttc tggagatatt 19560 agcctgaact cagcaaaataggatgatcaa aatgaacctt tccagtgaat tctgtccttc 19620 ttgtgctgtt gtcatctgacttagatatac tggccgggcg cggtggctca cacctgtaat 19680 cccagtactt tgggaggctgaggtggttgg atcccttggg atcaggagtt tgagaccagc 19740 ctggccaata tggtgaatgaagccctgtct ctactaaaaa tacaaaaatt agttgtgcgt 19800 ggtgaagtgt gcctgtaatcccaggtactc aggaggttga ggcaggagaa ctgcttgaac 19860 cagggagtcg gaggttgcagtgagcccaga tcacaccact gcactccagc ctggcaacag 19920 agtgagactc catctcaaaaaaaaaaaaaa ttagctggat gtggtggcac atgcctgtaa 19980 tcccagctac ctggaaggctgaggcaggag aatcgcttga acccaggaga cggaggttgc 20040 agtgggacga gatcgtgccactgcactcca gcctgggtgt cacagcgaga ctccatctca 20100 aaaaataaaa atcaataaaaaataaataaa tacataaata aatgaacaca taaattagat 20160 ataccaagaa aagtataaaaaagtcttgtg tgaacataaa tgaaaattgg ccaaaatagg 20220 taacagacag 
ggtcaggcgtggtggctcat gcctggaatc ccagtacttt gggaggctga 20280 ggtgggagga ccacttgaggccaggagctc aagaccagct tgggcaacaa agcgagacct 20340 catctctatg aaagaaaaaaaaatttaaaa gacgtaatga acaacttgct tgccttcctg 20400 cctgccttcc ctaaaatactaagttaaatg caatacatgc cctgacattg tagtttgctt 20460 tcacaaagat ttactgaatacttactctag gctaaacctt gtgctacatg ttggggctac 20520 agggatgaaa garaattggtcttgccctcc aggaaccttt catttagtac agagatttag 20580 tgtgtgctgg ttggtctctgttctccccct ctcctccaga tctattctct atttcttccc 20640 ctctccctgc ctccaggaaggggggctgga tcactgtggc tcattgctct gtggcttctg 20700 attgagttca gccaatgggaggcatmattt tggcgtggca gctctggctg ttcctctgca 20760 attgcagttc cctcctccaaggctctggct ctcactgggt tcctgtatcc aataacagac 20820 tcccttaact gcccacttctgaaaacagtt tctgcataaa gctattttca taatttcctc 20880 tgatgtgcct tctgtttcctgtgtagaccc tgattcaata ggaaaataaa ttattgaaat 20940 agaggaagag acaggtaataatagaggtat acacaagtag aatggggcaa taaatggcgc 21000 attttcgcac catcaagagtgcccatgtaa cagagataag taaatgcatc ttgagctgaa 21060 cactgaagga taagaaacaaaggggagaaa gacctagaag gggcaatata cagcaaggag 21120 ggaaaataaa ctactgtgcattcatgccag tgttagcatt taggacatct ggaagctaga 21180 ggtggagtgg aaaaggagagagtgatagga gctggggtca gagagtttca gggtggggaa 21240 ggtcttgcag gaccttgtaggtaattgtaa agcatttgga ttttattctg agggtcactg 21300 gggtgtcatt agagacttttgagcaaagag gtacatgctc tgactgaact ttattctgtg 21360 aacaatcaga atcaactagatggatttaag tatgggtata ccatgaaaga aaattactta 21420 agatccttgc tactcaaagtatgagccagg accagctaca ctggcatmag ctgggaactt 21480 gttagaaatg cagaatcccaagtccccgag acaaactgaa tcagaacctg cactttaaca 21540 agatcccagg tggcccatttgtatggtaga gtttaagaag cattggttta aaagatccct 21600 cttgatagga gcatggaagatacatttgag acagaataga caagtcagag acaggtggga 21660 agggcctaaa acagggcagaagtagggagg taaatgagga gacaaataca aaggaagaaa 21720 atgcacagca cagtgtagacaattcctaaa tacttaaaaa aatttttttt gaaataatga 21780 tagattcaca ggaggttgcaaagaaatgcg tagggaagaa caatgcaccc tttacccagc 21840 ctcctccatc attaacatcttatgcaacta tattataata tcgaaaacaa tcaagtgaca 21900 ttgctacaac 
ccatagagcttattcagatt tcaccagtta ttagatgcac tcgtgtgtgt 21960 gtatgcatat agctctgtgtaattttatca tatgtgaagc tttgctacca caatcaagat 22020 attcaagcca ttagcagaagattttctggt gttacctcct tatagccaca cgcattcctc 22080 catcattaac ccctgggaacaactaatctg ttcatctcta taattattct atttcacgaa 22140 cattttgtag atgggtacatgcagtgtgta tcttttggga ttggtaacag agcaagacag 22200 gatctcactc tgtcacccaggctggagtgc agtgtcgtga tcttggctca ttgcagcctc 22260 cacctcctgg gctcaggtgatccttccacc ccagcctcct gagtagctgg gactacagac 22320 acacgccacc tcacctggctaattttttgt atttttataa tgatggggtt tcaccatttt 22380 gcctaggcta gtctagaactcctgggctca agtgatccaa ccgccttggc ctcccaaaat 22440 gctgggagta caggcatgagccaccacctc caccagcttt ttcattcata ctttctttga 22500 agttcatcca agttgtgtgtatcaatactt cactccttcc agttgctgag tagtattcca 22560 tggcttggag gtgctagagtttattcatca cattcaaccc attgaaggmc atttgggtgg 22620 cttccaagtt tccagttttgggctattatg aacaaagtta ctatgaacat tcatatacaa 22680 tggatacttt ttgtatgaatgaatggaata gaatggatag gatttagtga tcagctatgt 22740 gggatgaaga gtggcataagtagtaaaaag taaccctcaa tgcaatgtgc agccagcaag 22800 taccacaaaa agagtttattttgtttcata catatatttc tatatataca tacacacact 22860 ttattaataa ccaaatagtatccttttcaa atgaaaacag taatttaaca taaactatga 22920 acttaaaatc taaagtaaaacttgacaaca gtgatgcaga attttttgct ccttagctca 22980 gttaggtctg tgttcttatcttatgaccag gaagaactag gtaccctgac atcaaagaat 23040 gagtggcata gaatttattaagcaaaaagg aaagctctca ggaaagagtg gggtcctgaa 23100 agcaggttgc tggttgccccttcgtagttg aatacaaggg cttctatata aaacctgatg 23160 gggccgagtt ccctgttcgtataaggcatg aattcctggt ggctccaccg ccctccccca 23220 gtgcgtatgt gggaccttcgtccactaggg acatgtttag acaagctccc tgtgcacgtt 23280 cccttatctg cacaaaacatgggttggagg ttctccgggg acccttcctt tactttctgc 23340 ctaaagcaag ctggctaactcctttcaaca atactaaaga catacagaca atggttctca 23400 gtacaatcat tttaaatatttaagtaaact taaaatggtg tttgttttga tttgacattt 23460 taaaagatat cgctgttctaaaaattctgt gtttttagtt gtttgggctc ctattctaca 23520 atgtgctatt actattaagcattcttgtat catggcattc ctcaaatagt ttttaaatta 23580 cttttaattt 
gaagaaggaacattctgtac agtcacggaa agtgtcaaaa atgaaaatga 23640 ggcagggtgt ggtggctcacgcctgtaatc tccgcacttt gggaggccta ggtgggtgga 23700 ttgcttgagc ctaagaatttgagaccagcc tgggcaatat ggtataaccc tgtgtgtaca 23760 aaaaatacaa aaattagccaggtgtggtgg cccaagcctg tagtcccagc tacttgggaa 23820 gttagggtgg gaaatcctaggtgacagaat gagaccttgt ctcaaaaaaa aaaagaaaaa 23880 agaaaatgat aaaggatacatatcaggaaa acatgcatgg tattttgtat catctacttt 23940 agagtaattc cagtatagtggtttttttgt tgttgtttgt tttatttttg agaaagggtc 24000 ttgcgctgtc acccaggctggagtgcagtg gtacgatctt ggctcactgc aacctccgcc 24060 taccaggttc aagccatcctcccaactcag cctccagagt agctgggact acaggtgtgc 24120 gccaccatgt ccagataattttgtattttt tgtagagatg ggattttgcc atgttgcctg 24180 aatgcctggc ctcaagcaatccaccctcct cagcctccca aagtgctggg attgcaggcg 24240 tgagccacca cacccagccccagtgtagtc gttttttctt ttctttttta ttctatgttt 24300 taatgaattt acacgttacccaaatgttcc ctagtttttc tgccttccaa gatcactctg 24360 gaagaatatt taagaatataccaaataaga atatgcaagt cctcccctaa gggtggcagg 24420 aagaacaccc ctcccccagatggtatttag cgcctctggc tgggaacggc ttccccatgc 24480 tcctaggtca gggtcctctcttggcatgac actaccacca cagtgcagac ccacaacagg 24540 gagaaggacg gccacagtccctcaatcccc cttttccaag atgtgcacag cctgactcct 24600 aactccccac cactgactctaggggaaaaa cagcacaggg caggaaacga ttttccatgt 24660 caccaacctt tctctgagggaacctactgg ccacctccct cttaggacca gcccatcgtc 24720 cacaacgtgg aagtccagcttccgttcaaa tcggagttct ttcttcatga catttctttg 24780 caaagtcccg gaacccacagctctgagact ctggctgtcc cccaacccac cccatcttcc 24840 ttgtcctcac ccctggtcaggagaagccaa aacatcagtc agcttcccag taatcaagcc 24900 tggctttctc acccagggctcgccccagaa caaccaccgg cttctttcag tgtagccaaa 24960 aggctattgg agtcttctcaaatgaaagag attttatcaa aggcttggag aagaaaagaa 25020 aaagaggatt atataataaaacgtaaaaca acaaacatat acacacaaac aaaaataaac 25080 gtgagatatg attctcccggagtgtttaga gcaggaatgt tcttgggcat ctgccttccc 25140 ccaccagcac cccccacaaggcaaggccag ttcaccctca gtgctcacta ctttgcagtg 25200 ttcatagaat atttgtaataattttaggcg gctccctaaa atttcttttc tttttctttc 25260 tttctttaga 
gttgcgtccctctcggttgc caggctggag ttcagtggca tgttcatagc 25320 tcactgaagc ctcaaattcctgggttcaag tgaccctcct acctcagccc catgaggacc 25380 tgggactaca ggtatgcaccgctatacccg tctatctttt atttatttat ttatttagag 25440 acagagtcta gctctgtcacccaggccaga atgcagtgac acgatctcag ctcactgcaa 25500 cttctgcctc ccagatttaagggtttctct tgcctcagcc tccctactag ctgggattac 25560 aggcttgcac cacctacgtccggctaattt ttgtattttt agtagagatg tggtttcacc 25620 atgttggcca ggcaggtctcgagctcctga cctcaagtga tccacccggc gtggcctccc 25680 aaagtgctgg gattacaggcgtgagccact acgcccagcc tattttattt tataattttg 25740 ttttagacaa ggtctagctctgttgcctgg gctggagtgt agtggtgcaa tcacgattca 25800 gtgcggccct gatctcctgggttcgagtga gccttagcct cctgtttagc tggtactaca 25860 ggtgcatgcc accacctagctaatttttta aaattttttt gtagagacgg ggtctcaccc 25920 tggtgtccag gctggtctcaaactcctggg ctccagtgat gctcccacat tggcgtccca 25980 aagtgctggg attataggagtgaactactg tgcccagtct ttttaaaaaa ttttcaagag 26040 attggggtct tgctatattgcccaggctgg tctccactcc tggtgttaag cgatcctccc 26100 acctcagcct ccttgagtagctgggatgac attacaggca cacactgcca ccactggctc 26160 taaaacttct tctgtgccatttgtgcactt cacccaattg cctctttgta gtaattaatt 26220 aggatctagg gtgaaaaaaaagtcaacagc tatatatagt cctcaaagtt ttgtacgtat 26280 ctgagcagtc atcagttgcacagtgcagag ggatgaactg ccgtcccgcc acctaaaaag 26340 cattagtgac catcagggaaccgtcagatg catgccagac taaagcagag tgaggctgtg 26400 ctgggtgctc tgtctgtggctgcccgtgct ctcacttccc tgtcttgctc tgtgcctttg 26460 ggaggttgac cctgagttggcatctcaggg tctcagtctg ctggtttcct gsgttcccct 26520 tgaaggctac tgctcccacaaggcaaccac ggtccccgct ctggctctca ctgagctcca 26580 gaatcattgt ttcctccccttacccaagtg agaataatta tgttttattc cagaaccctg 26640 acaaatgaag aggcctaaaaaccccctagg tattatccga tcttggtgat cagggaggtg 26700 tttgttttgt tttttaatgcagacacatag ttttaaaaat tattcacttc atctactgta 26760 agaaaagtca tattaattcacaattttgat taaaacaaac aaacaaacaa acaacttctg 26820 tgacattttg gctaacaagtggttcaatat taaagctttg tccaccaggt gcagtggctc 26880 atgcctgtag tctcagtgctttaggaggct gaggtgggag gatcacttga ggccaggagg 26940 tcgaggctgc 
agtgaaccatgatctcacta ctacactcca gcctgggcaa cagagtgaga 27000 ctctgtctct aacaaacaaacaaacaaata agtatagttc tttcaagcat ggcagacaat 27060 ctgtctcctt tggcctgggtctctcactgc cttttagata aaaatctggc aataaccaaa 27120 gagttttcat aaggcctgttgatctattta taagacatgc atataattta cttgaccatt 27180 ataataccat tataataatctaaatctatt ttctttatcg tccaataatc cacagagtca 27240 gcacacaagg attcttttttccatatatag gctgagtatt ccttatctta catgcgtgac 27300 gccaaagtgt ttcaggttctggatgttttg ggattttgaa atatttgcat atacacaatg 27360 agatatcttg gggatagaacctacatctaa acacaaaatt catttatgtt tcatatacac 27420 cttatacacg tagcctgaaggtaaatttac acaatatttt taataatttt ccacataaaa 27480 caaagtttgt atacattgaaccatcaggaa gcaaggtgtc cctgtctcag ccacccacaa 27540 ggacactctg tagttgtctttcattcctga ttccgaattt atacgctact gacaagcaat 27600 cattttctta cacttattcacacaagagca cttagtaaaa aatatgacat atatatctgg 27660 catgctcaga aaagctattttgcagcagaa aggagctggg agggtccttt ttttcccttg 27720 gggacacgga ataaattgtgtattatgtgc ctgcattttg actgtgaccc catcacatga 27780 ggttaagtgt agaattttccacttgtctct ctgtgcttaa aaagtttaga ttggccaggc 27840 atggtggctc atggctgcaatcccatcact ttaggaggcc aaagcaggtg ggtcatttga 27900 ggtcaggagt caaaaccagcctggccaaca tggtgaaacc ctgtctctac taaaaataaa 27960 aaagttagcc tggcatgttggtgcatgctt gtaatcccag ctactcggga ggccgaggca 28020 ggagaatctc ttgaacctgggaggcagagg ttgcagagag cagagatcac tccattgcac 28080 tccagcctgg gtgacaaagcgagactctgt ctcaaaaaaa aaaaaaaaaa aaggttagat 28140 tttggagcat tttggattttggattttgca ttaagtgtgt tcaagctgaa aagaaaatcc 28200 gatttgctca ggacaaacttaacaaaacaa gtgagatatt ccaatactat atatatgctc 28260 ctgtttatat ttccttaattaatttggact tggaacaact tggccaatta tggattagag 28320 gatgagactt aaatgttactgtacaaggga tagaacgatt cattcctcta tgttatcaaa 28380 tacttatggt attttmcccatcctgctgtc atgcagatcc aagaaccaaa ttaaaacaca 28440 tttgccgggg tcataataatgtggccagaa tttaaagaaa aacttgattt ttaattatgt 28500 atgattttgc ttgtttagtctaccgatttc tatttgcttt agcttactca aaaataaagc 28560 gcggcacttc gaagactcaatagtcttcca ttcatgtggg cctttataat gcacgggccc 28620 agatgcaata 
catctggcggtctgcttggg ttggccactg gattgaagga ggcagagaag 28680 tctgggatga ttcccaaatgtctggatctg gtgacaggga gatatggcag ggcgagctta 28740 ggggaaaaag ctgggttaggaactgttgaa actgaaatcc ctgaggsytk tgccgacaga 28800 gagacagccg gtagaaggttgtctttgcct gtctgtggtt ccaggtaact tcatcgaaag 28860 agagtttcag gcagtagaaataagagcacc caggacaaag ccccagggaa gagaaacatc 28920 tgacggagga cagaggaagaagggtcagga atgagactga gcaggtgtca tgtgtctgac 28980 accagagcct gacacatagtacgtagtaga cactcagcaa ataccgtaac agagatgaat 29040 ccaaggctgg gggaggtggctcacgcctgt aatccccaca ccttgagagg cctaagtggg 29100 aggatctctt gagtccaggagttcgagacc agcctgggaa acatggtgag accttgcctc 29160 taaaaaaata aaaattaaacattaaaaaaa gagatgaatg cataacctgg ctgctggagc 29220 caacatgggt tgggtgagcccactcttacc agcagctaat caaaaatttg cctggaattc 29280 tgaggctcct gtcctacgtcttggctgctc ctcccagatc accttctggc cggtcccaag 29340 tccacttccc gtgctccttgctcccttcct cctggtctcc ctcacacttt cctttcctac 29400 tccccttccc tctgtggccctggctcagcc cagcacaggg agagccctgt gccacctatt 29460 acagctcacc tgcacctttgcatctttcag aaaggagcac ctacaagata acccaccccc 29520 cacctttttt tttttttttttagtagtaca gattgcctct catagcataa ttgggcttca 29580 ttattatcct taaagaccctctttctgtgg cggattggga tggataaaat aaagaagatc 29640 gagaggttga agaacccatcctgttttgcc agtgagaagg ggatagaatt aaaaggatta 29700 ggagggctca ggcatggtggctccagngtg tcatcccagc tactcaggag gctgaggcgg 29760 gaggatcact tgagcccaggagttggagac tatagagcac tatgattaca cctgtgaata 29820 gccactgcac tctagcctgggcaacatatc aagaccctgt ttctagggac aaaaatatnn 29880 tttaataaat ttaaaaattaagggaaaggt aaccacatcc tgctacaaan aaaagaagnt 29940 ggagaggtan gangaggaccaagagctaat ggcatcattt acacaaaaag agatgcttta 30000 aaatcagttg ctcatccaattccacaagga caataagtaa gaaagaggat agaaagtcac 30060 cggtggattt ggtcatcattggcttcttga tgactttagc aacaaaaatt cttgttggta 30120 gtgagagtta gaccctggtggactgggtag ggggttcctg gatcatgagc aaaggcctgt 30180 gccagccaat ggcccccactacactctgcc ccggcctttc tcatctcaaa aaatggcatc 30240 ccccatccaa agctcaagtcaagaatccag cagccacctt tgattctgca cttcccctca 30300 cctcacagtc 
cagtcccatctccaaaataa gttccaaaty tcaccacttc tcattctcca 30360 aagaggmacm attatctctttcctggtgat taaaacagct tcctaactgg sttcccttct 30420 accttgcttt cccatagtccattcttctca ggacaacaac agtggccttt taaaaccagt 30480 gcattattgt tgccctttgggaaatcctcc acaattatcc agtcttgctt caaaaaatgt 30540 atgtatttct gactttttaccctgccctac ttacaggata tgcacatttc tgatctccag 30600 ccaatatcac acttcttctctcactgcact ctgccacact tggccaagtt tgttcccact 30660 cctcttgcac ttgctctcagatctcagaag aggcgtgctc cttgtctttc aggccagccg 30720 gcttcacaca tgtgccacgtgcgcccctcg ctcagaaggg atctgtactc ggtttggatc 30780 tattgttgcc atcttgaaactcttaatact ctttgaacac ggggcccgta ttttcatttt 30840 gcactgggtc ctgaaaattgtgtagctggc tctactttca gggattgtat cagaagtctc 30900 ctcctcaaag aggccttcctcggccactta tcctcaagta gctcctcccc ttctaagtta 30960 ctggctatcc catcattcccacttaatttt cttcataaca gttgtcatgc ttttatacat 31020 tctggcttct atatttatttgtgtattgtc cagttccctc cctttggaac gcagcgtggg 31080 cacctgcaac gcagagaccactgtatcccc ggtgcagaat gtaatgagtg cctgatacat 31140 ttgccgaata aactattccaagggttgaac ttgctggaag caagagaagc actattctgg 31200 gtaaaatgga aattttaaatgtacttgata tttatataca tcctaatcaa taattaaatt 31260 tgtgtagtgc tgatctaaacagataaattc tggcttcatg atgatggtga agtggaatat 31320 aattttctca ttttgtattcaaactagatc tttttcatga aaggatttga agtctagatt 31380 caatgcctac ttttgctacttatgttatat gaaactaaaa caattatttt attgtatttt 31440 tttgagatgg agtcttgctctcgttgccca gactggagtg cactgctgcg atctcagctc 31500 actgcaacct ctacctcccaggttcaagcg attctcctgc ctcagcctct cgagtggctg 31560 ggactatagg tgcgtgccaccacacccagc taatttttgt atttttagta aagatgggct 31620 ttcaccatgt tggccaggctggtcttgaac tcctgaccca agtgatctgc ctgcctcggc 31680 ctcccaaagt gttggattacaggcatgagc cactgtgcct ggcaataatt ttagtttagt 31740 ctgaattttt ttttttttttgagatggagt ctcgctctgt tgcccaggct ggagtgcagt 31800 gacgctatct cagctcacagaaacctccgc ctcctaggtt taagcaatcc tcctgtctca 31860 gcctcccgag tagccaagattacaggcacc tgccaccacc cccagctaat ttttgtattt 31920 ttagtagaga tggggtttcaccatgttgac caggctggtc tcaaactcct gacccaagtg 31980 atgtgtctgc 
ctcagcctcccaaaatgctg ggattacagg cctgagccac tgtgcctggc 32040 ctagtctgaa ttttttaaaaaggttattgg tctaccttcc aatgacattg cactctgtgt 32100 ggctcaataa aacattttcatttataataa ctaatttgac ctgctcagca atctctaagc 32160 aagatagagt agctgtaattcttcatttta caggtcatgt caaatcattt cgtacattcc 32220 agctatgtac gagagcttggtgagaatatg tgaataataa tcacagaact tcagagctgg 32280 gagtaacagc tggaaatatttcttccaata attgcatttt ttatgagagg acgatgaggt 32340 ccaagtggac aggaccatgagacaatcgtg tggcaaggaa gttgatgcaa tttgacctct 32400 taagtcagtg atctttatgtccatcggtcc tttccagcaa gtgagttagc caacctttgc 32460 ctgcaaagga ggaaatttttaattgaggat ttacactctg cttctaaaat tttgcttatt 32520 attgtgaata attttctttaagtttattaa atgaatggct gaataaatgg acataaggaa 32580 agaaggaagg gaggaaggaagggagggagg kaaggaaggg agggaaggaa ggaaggaagg 32640 aaagaaggaa ggaaggaaggaaaaagaaga gaggaaggaa aggaaggata agtctgatga 32700 cagctgctat tatattctacgtggataatt tatttagatc tttatacttt atcttttgtt 32760 ttacttctct tatgcatattctcctcaact ttttttcagt gggccagagg aggaggactg 32820 cctcttgtga ctgtggaaggacttctacca ggctaacacc cctggcctct caccctccca 32880 tttctcaccc tgcaaagcagagtgctattt gattcatgtt cttagtctgt ggatctcagt 32940 tgaggagaac tcgttagagatttgccctct ttctgtcttt ttgagacctt actggtgcaa 33000 gacagcaaat cctagctggtgtctacagga cacatgcact cttaggttac ataactgcag 33060 ggaccactgt cattgtatcctggagctggt tctatataag acacagcctg agcagtatat 33120 aggcttccta gtctgctcctggccaaatgt cccagttgga agcccagagg ttgtctggct 33180 atgccagtgg caggatgggcaagtctaact caagggtgac atattagcaa gacctttatg 33240 gccatgcatc taagatgctctgtccaagcc tgaacttagc aacaataaac ctgacatttt 33300 gaaatccatc tgattcctctattttccagt tgatgccaca tgcatcctct tgccatcttt 33360 cttaattaag atgactttgcttctaaatct ccttaattat caagcagcta tctacaatat 33420 tttgtaatcc ccttaaatcttgagcataat gatgtcataa ttatgaaagt gmccggwttc 33480 acatgaagta ttgcttaatcttaagaacaa aatggcagct gtgaaaacag atgaagtaat 33540 tagaggaaga gcctttttggaagcttcgag atattttcaa agtaattagt actagttagc 33600 aataaagttc tgttctgagaaattgctctt aaaggaggaa catggattaa agaaaaaaat 33660 ctgctactag 
gaagtaagccatcttcctat gtgtgtgatt ggttttgctt cctgaaaact 33720 ggttccgttt tcaacaaaatttgggtctgt tgaaaaagaa cacgcagatg ccagccttga 33780 tgtcaaacgg gcccaaacttggacagtggt aaactaatga gcaatggtgc acagagtcag 33840 ggtaaaagct ggacaatttcctatgaccaa cttttccagg actctgctct gctcttcctg 33900 agaaaaatac ccaaagtgctgcctcttcca ttggcccaac catgcatctt tcaggatagg 33960 mcacatctgt ttataggtgtggattgtagt tgctcataag tgacattagg ctgtttaaaa 34020 taataatagt tcgagttttgctatgagctg atctgttttc caagagagct aagagttttc 34080 cagctaaaag agggaattagtgggtaatca aggcagctga catggggtgt ggctgggcct 34140 tgaatgtgtg tcactctctgtgcccaggca gagcaaagat aaactccaga ctgcatgttg 34200 ctcagagacc aggaccaacgtcatagggcg cctaaaaggc aggtggccca gttcagaatt 34260 gtcaaggtct gacctgcttggacaagtgct gagtacatag taaggatgga ttggctagtc 34320 tctcaaaact tgcaaacagggcgcaggtga tcttgagatt tcaggtgccg gagagaccca 34380 tcgtgtagat tccagagttggctatcatga ctaacagctg tctaagttgt ttttaaatga 34440 atcattaagg gctacattttcagttcagct aatcaagtag caaattacgg tgggtctaaa 34500 atacttatct attgcattatgtatatgcta gactttatca ctttagttgg ttatatcgct 34560 tcatatacta acagtcaaaaaatgccaaac gagaaaacaa acaaacaaaa atgccacatg 34620 actgtgtaaa tacacttttcaaactgtttt atctaagagt ttactcactt tcacattgtg 34680 gcttatagta ttttcaatctaagagactaa ttttgcttac ataggaaact acatatttta 34740 aattgaaaat taaaaaaatatttttaaggt tttaatgagt cctatcaaaa cacatttgta 34800 tataggaagg tagcccaaggtcactgttgc caattgtgta cacagcctgc cctmtagtgt 34860 tttcttctaa acagcaccaaattttagatc atagttgtaa atctcaaaat gttgggttaa 34920 taggattaaa cactgtgtcatcaaattgat aggacacagc taaatccctg acacggatga 34980 aaattaaagc agagaaaaacgaaggtcctt ccagaagctg gtggcaactt cactggggag 35040 atattgcaaa gttagtggtaaatacactat attaaaaagt tttgttttgt aaatagagta 35100 atgatagaag aagagttagttgaaatgatg tatgtaaaat gtgataactg cataattact 35160 agtacagttg ctagtttacgactgtattaa aaagacattc caaatgttga tcaaataatg 35220 gaggtttctg tggttgttttctttttaaaa tagtaaatat acgtaaagca gataaatatc 35280 ccctttgtgg gagttaaaataatctaactt attttatagt tttaacttta ttaaagcata 35340 cgactattct 
aacttatttaacttttctta gtaaagtttt aacctctgta tttagaatat 35400 ttgtaactaa tgtgtatcgaattaaactca aagggaaatt cattaactga gaagaaaaaa 35460 ttttaactgt gcactattcacatagcataa tgggttttat aaggagtatg agaaaaatgt 35520 gtgtggttgg ttttgctttctttaaaaata atagcgaacc acgtaggtaa aaactcactt 35580 gagaacatag acttttggagggaaatgcca ggtgtggtgg ctcacgcctg taatcccagc 35640 actttgggag gccgaggggggcggatcacc tgaggtcagt agttcgagac cagcctgacc 35700 cacatggaga aactccatctctactaaaaa tacaaaatta accgggcttg gtggcgcatg 35760 cctataatcc cagctacttgggaaggctga ggcaggagaa tcacttgaac ctgggaggtg 35820 gaggttgcgg tgggccgagatcacgccatt gcactccagc ctgggcaaca agagcaaaac 35880 tccgtctcaa aaaaaaaaaaaaaaaaaaag aattttggag ggaaaaaaat ccctctaaca 35940 gattcgaatt aattctgtgtttcgagatgt ttacaaaatg aagcttggac tctgagagga 36000 tgtgatctat cctctccattgcattgagtt tcaagtactt cacatggcgg gcttttttaa 36060 ctgtcgtgaa gtttaaaccaaatagggact agaatttgtt tgttttttta acttacattt 36120 caagcttcct tatgtctcaggcacattagc ataagttgtc taaagtcata aggaaaaatt 36180 gacagaaaaa tgctttggagccccaggtgt tttcaattga tgccaacaga aactaaccaa 36240 atggaagaca tttgatgcgggtttattttt cctttgcagt aacagcggga acatgaagcc 36300 gccactcttg gtgtttattgtgtgtctgct gtggttgaaa gacagtcact gcgcacccac 36360 ttggaaggac aaaactgctatcagtgaaaa cctgaagagt acgtttggtt tcttacctgt 36420 gctgtgtcct gtttgcatgttggttgtcct gctggcgttt atagtgagtc gcagttgaga 36480 gataaccata ttcgctgttttcacggtgaa acgttctcaa ggcgcttaaa ccaggtcatc 36540 ctgacgccaa acatctgggtaaaaatagaa aattccaatc acgtctctgc aggcgttcac 36600 ctttccagat gtttgtatcatgtagataca acttgccagt tttttcactg catttttttg 36660 tatcatccag atggttggtgtcatctcagc acagctctaa tgaacagtga aatacttttc 36720 tagcatttga aaaatttaaaccattagagt aatctgtgca attgttctta aactagtgaa 36780 agaatgggtt ataattacgttgaatctggt tgttctgtgg ccattaactt gcaactttgc 36840 ttggtgatat atactttgggtacttaatat atagaagaac aaattagcta aaatgcagct 36900 gatttggggt ctgtaataatcagagtcaag aatgagctcc tcagtaggcc acgttggcta 36960 ttttgaacag ggaatgacaatgaattttaa acttactaag ggcttattaa aggtgtataa 37020 gacacgtcca 
ttgagttattaaggaagctc gtattacatg ggatactttc taggtctcgt 37080 gcctccttat taggtaactgaagctgaaag aaagagaaat tgctgactgt gtttgaggtc 37140 cccagctggg cacttaatataaattatgaa gaaaatgcaa aattttctct aatataaaca 37200 cacttgagtc ttaaatgaaagaaaaaaaat ggataaatga aaacagggcc tgagcaagtg 37260 acaagaatga ggttcagtgaactctatttg tttaggcgct cacaagtgag gagtagaagg 37320 tatggtccgt gtggcagctgtgtccatgtg gcagctgaca gctaattcat tatgatctgc 37380 tttcagaata tgagcctataagagaacaat taagcctctc ttttggagac atgaaaggtt 37440 ggtgaacttg gtgttttgtaatctgatcag atctcaaaga aaaaattgcc acatgtcttt 37500 taggtttttc tgaggtgggggagatagatg cagatgaaga ggtgaagaag gctttgactg 37560 gtattaagca aatgaaaatcatgatggaaa gaaaagagaa ggaacacacc aatctaatga 37620 gcaccctgaa gaaatgcagagaagaaaagc aggtacagtc attgaaaata atgtctgttc 37680 ttacacagat ctggaccagaaatactgcac ttgttagtgc gattgatgaa ttacttattt 37740 tccttagtaa taaatttcatgggtagctgc ttttatttga ggaaaagttt aagggaagct 37800 tcagatttcc ttgaagaacatatttcgtgt aggataggct tctgcaagac tccaacccgg 37860 aatctggggg attcatctctgtttaagtgc tgctttctca aaaatagatt attcttggtc 37920 tcttctgagt taggatattgagtcaaaagt atttgaagag tttttttttt tactagatca 37980 gtggtctcca gagtttttgttttttgtttt ttgtttgttt ctgtttttga gacagagtct 38040 cgctctgtca cccaggctggagttgatccc gctcattgca acctccacct cctgggttca 38100 ggtgattctc ctgtctcagcctccctagta gctgggatta caggctccta ccaccacgcc 38160 tggctaattt ttgtatttttagaagagacg gggtttcacc atgctggcca ggctggtccc 38220 gaactctggg gctcaagtgatccacctgcc tcagcctccc aaagtgctgg aattacaggc 38280 atggaccacc gtgcctggcccagagatttt tggtctctca ttcctatgac taaaaaattt 38340 gttaccactc actcctaaatatatgcatat tcatttactc atgaattaga tacatgaatt 38400 gctaccattg atatctcaaggcacaatatg tatttaaggt gagattcatc attagcgagt 38460 gtggatataa gtccacatttcaaataatct tctagatatt ttgaaacttt tagccgactt 38520 gccagatctg attagatcaccatagttttc ccttgtcact tggccaataa agagctcata 38580 atgatcaagt gtcagctctgccatttgctt ttggtccgct tgagcttaaa ttattcattt 38640 ttaaaatctg ccaagtttttttttttttca aagaatcttg ttaagcctcc tgtccattta 38700 gtgaaggtta 
ctttagttaaaactagataa taaaatccat cagtctacct gagttctctt 38760 acatggcaac tcattacaattgggtgcatg tgaacagagc aagggaacta tagttgattc 38820 ttctggaatg tagaggatccccttttcccc aaggtcatca catacagttg ggcacacaca 38880 gtatctgaca tatgcatctcaagagagtac catgtatatc caataatgca tcagcctaat 38940 cactttttca aattcaaatagctttattta acagctatag cttgaactac atattttatc 39000 catggagaat acatattatattcaaatgtc tttggaagat gtaaaaaatt gttcatatgc 39060 cacagtataa agttcagtaaatttctaaat tatagacatt gaatagcttg cagtttaatg 39120 acattaataa ttaacatcacactcaaaaca atgacttttt taaaaaaggt tatcttcaam 39180 cattmccctt aaatcaaagaggaaattaaa actgtaacaa aaataatttg gaaaatattt 39240 tcaattttaa tgttgagagtaaattacttt ttaaatktat ttttattttt tgaaaaatgt 39300 taagttgtaa aatacatataacaaaattta ccatcataac catttttaag tgtaacgttc 39360 agtagtgtta aatacattcatactgttgtg caaccaatct ccagaattat tttcatcttg 39420 caaaaactga aagtctatacatattaaaca atgccccatt ccccccaccc cagtcagatt 39480 tttaatttaa aaatacaagtggaagttcta atattttcta tctatccctc tatctataaa 39540 gttgggggcc actgaattccagattgctgc ttgcatcttt ttacttctga gcatcatggc 39600 ctctgggagt ccgttaagcaactggagccg ggtagtgtga caggctgacc ccaaagctgt 39660 gtgtcagcgt caccggactggttgatgttg cagcctcacc tactgccctg agtcagtcag 39720 ggttctggca aggaaaggagaatgcctgac cagcagctgc aaacccttct cccttttggc 39780 agcaatcaaa agattttgaggaaatctaaa atagctcctc atcaggaaaa tgtggaagcc 39840 cctccagctg ggatcttccctggtgggctt gtgagcctgg ccatctggga atagagacac 39900 tagatagcac tcatacactcttcacaaaac acattatcac atggaatgtt ttgaacatct 39960 gggtaaacca ctactttcattttatagcta agaaaactgg ggtttgagat gtttgttaat 40020 taacatgtta ctccaacactgtaatgaatg aactgagata aagtcagcag atgtgtgcac 40080 gggggaccca gtgattttctgcttttctca cttccctgaa cctcctggca aggaggacag 40140 ggtatacagc tttaacaagaatattccact ttgggtgggt caagtaagca aatgtggatt 40200 tcacttctgg ccctgaagaatccaagcaac tagtagaatt tttgtttatt cttaaaaatc 40260 ttattgtaca aaaattcattgaattatact cttaagtttg aggcactcaa ttagaaagtt 40320 aatcggaaaa aaaaaatctgtttaaccctg agtatccctc cctaaaatta cttaaagcct 40380 agaataaagg 
tcagtttagacaaattatga attggcaaat atggtgttag caaccctagt 40440 ctcccagtat tgagccccacccattctcaa gagtactgct cagtggtgac ccagcatcct 40500 cactgtcccc ttcctccacccctccttatt aatatttagt gagactatct gaaacttatt 40560 aagtaggaaa ccctagagaaggttagagtg acttgacctc caaatcaggt tttatttgta 40620 tgtgttttta atgaaatggggtcttgctat gttgctcagg ctggtcttga actcctgggc 40680 tcaagggatc ctcctgcctcacttcccgag tagctgggat cacaggcact agccaccatg 40740 cctggctcaa tgccaggttaatatagcgct tttgataaac tgtcaactat aggaatagag 40800 ttataagcgt gaatctgccagttggtacaa tgtctagcag gaaacggaag gcgtcgatag 40860 gatattcctt aggaatgtttactagacaga ggtctacttc ttccatggca atgtttcact 40920 tccaaaactt gggacctgtgatttggtaac tgttttttgt cctgcttctg ggcagtgaat 40980 ggaaggaagc ctgagagatactagttatta tactggacta gttataataa cagatgtctt 41040 gcctatgata atggatactaggtataataa tagatgcctt gcttgtttag ctcatttaat 41100 gcaaagacct tgagaagtagatactattat tcctattatt cttatttgca aatgaggaga 41160 ctaaggctta tatgtattaagtaatttgcc caagggtaca cagccactgt agtttggaat 41220 tgggaatatt aggattttggcttatgagga caatgagcag aatatgtaaa attgggactg 41280 attgagaaaa tcctggaggtattgttactt gccttggaga aacaactttt tttttttttt 41340 tttgagacag agtcttactcttgttgccca ggctaaagga caatggcacg atcttggctc 41400 actgcaacct ccgcctcctgggttcaagcg attctcctgc ttcagcctct gaagtggctg 41460 ggattacagg cacccaccatcatgaccagc taatttttgt atttctagca gagacagggt 41520 tttactatgt tggccaggctgttctcaaac tcctgacatc aggtgatcca cccgcctcca 41580 gcctcccaaa atgctggaattacagtgttg agccactgca ccctgccgaa aaacaaccac 41640 tttaagatgt tagattccagccaagtgaag tggctcatgc ctgcaatccc aagcactttg 41700 ggaggtcaac ctgggcagatcacttgaggc caggagttcg agntcagcct gggnaaantg 41760 gtgnaactcc gtctctantanaacatacaa aaattngccc ggcatggtgg cacgcacctg 41820 tactcccagc tactggggaggctgaggcag gagaatctct taaacctggg agatggaggt 41880 tgcagtgagc tgagattgcaccactgcact ccagcctggg cgacacagcc agactctgtc 41940 tcaaaaaaaa aaaaaaaaaaaaaagatgta agattccaaa attgttctac aaagtscaag 42000 gacacacaca cactcctgtctgggtcaaaa tgtatattgg caagctgggg ccctggcagt 42060 tttcttacgt 
ggatcatagcaaatgctacg tggcttagca gccaaacttt acaatgagga 42120 caackgacaa atcctagccaggcagagaag atgtggaaga ttgtcagtgc ccaggtgatt 42180 ctttgggctt aatactccaggaaagggtca tttccattag ctctgaggct gtcttcttat 42240 ggccagatcc actatactcacttcattccc ctgcacgata tctcggcatg gagggggctg 42300 gggttcagaa gtccacacttgcagggaagc cagaggtttg ggcaggggca caggaagaaa 42360 ggtctgttgc accatggtgctgacccgtga ggcactccag gggcagggct gaggctcgca 42420 gggacaggtg ccactgctgctgggctcctc accacccaga gcaggacttg gccaagtaca 42480 gcaagcacca caagggggagcactgggaat ataaacaaga agaacaaagc ttgtttatat 42540 tcccatttat atttatttaatattacatta tatataaata tatttattat attacattct 42600 aattgcagag atgccatcctgcgtctcggc aattacaatg taactcaacg ggaacattta 42660 acttgacata caagaattgtactttcttgc aatgtttaag gatatacaac aattaaagac 42720 agcataaatg aaagaattaaaatgtaccag ctttataaac tgtaaagccc actttcccca 42780 tgcaccagtg gatgagaattgaagacagac ttaccggtaa ataggtaaat cacagttgtt 42840 cccagatcgg gatggcatcttcattgtcag gtcacccaca cctagagtaa tgtctgtcac 42900 atagcaaaca ctcagtaaatacttagtgaa caaatgaatg aacagatgaa taagatttac 42960 agtcttcaat aggaatcaatcagtgctctt ttcttaaact aaacagaaag ctttggggag 43020 atctgacagc tgcgaggcacctgaaggaga aagaatgaaa aagcagttta gaatgtgtac 43080 atttcaaagg gtgaaatcaactaaggtgca catagatcat gaaatggaaa ttggactttt 43140 gtttctactt ttaactaggaggccctgaaa cttctgaatg aagttcaaga acatctggag 43200 gaagaagaaa ggctatgccgggagtctttg gcagattcct ggggtgaatg caggtcttgc 43260 ctggaaaata actgcatgagaatttataca acctgccaac ctagctggtc ctctgtgaaa 43320 aataaggtaa gagaaaaagagagctcaaga tttcacagtt cttaaggcac ctatttcagc 43380 ttactttttt attaatttatgttaatattt agaacggaga tgcctgatct gataggggcc 43440 ttttgctttc tagaatctaatactaatgtt tacataccat cacctgtgta tacgcaattt 43500 ataaggtaga gcaccattcagtggtcactg aatgcatctc ttaaaatatc ctggctttct 43560 gccttgtatt tgttatttgtgaacatgttc ccactagata gtaagctctt tgagggcagg 43620 gatcatatct tatttgtcttcacttatgca ttggtggcat ccagtaaatg tttaccaaat 43680 tgcatttgga atcatagcattgcagtctct gatttcaatc cacattaatt tttccttctg 43740 gaggccaaat 
atttaaagatactctctgcc tcccaaatct taccttcaac atgcttgcct 43800 ccttatgcat aacacacacacacacacaca cacacacaca cacacacccc ttcatgtccc 43860 cttttgccct acccatgtatgtagactggc atgttttctt ttttgtaccc tttggttatc 43920 ttctgagcag agggatcacagagggtggtg acctgaatag gatgagctct gccccactaa 43980 cggctccaat taagctagatttttctcccc cttcaagaag tgagctgaat acaaaattga 44040 gtggaatttc acgctccatattagagcaca tactaattag ggtatgctcc tggcttggca 44100 atgccatact caattacaaagggagcaact actaagataa tgaatgcgcc aagttaattt 44160 gcctccacta ttaattgcatctgctctatt tttagagcta ctgtcgcctg ctaatacacc 44220 agaatatggt gtaatcagcaccagcaggaa gtcaggagat atggggacca ttcccatctg 44280 ggtcagttgt gtgatcttatgaacatttct tggggcttta aaggtttgtt tttgtggatg 44340 aagagtcaag taaacagaagctggtagagg gagaggcaga caatccaccc aaattctttt 44400 ctttattttt tttcatgagacagggtctgg ctcttttgcc ctggctagag ggcagtggtg 44460 ccatcttggc ttactgcagcctccacctcc tgggttcaag tgattctcct gcctcagcct 44520 cctgagtagc tgggattacaggcgcccacc accacgccta gctaattttt gtatttttag 44580 tagagacagg gttttaccatgttggccagg ctggtgacct caggtgatcc acacaccttg 44640 gcctcccaaa gtgaaaacttgaccttttta ggctattggt gggcaatgta aaccaggaga 44700 aatttcagat cctgtttccataggcaaagg caaagtcagg tataagaggg ttaagaaatt 44760 atcttaaagt taattgcctcatactagctt gcccagaatt attattgatt tgaaatgact 44820 actgtaagtt gactttaaaatttgcaataa gaaatggtcc agggccgggt gcagtggctc 44880 acccctgtta tccctagcactttgggaggc ctaggcatgt ggattmcctg agctcaggag 44940 ttcgagacca gcctgggcaacacggtaaaa ccctgtctgt actaaaatac aaaaaaaatt 45000 agccaggcat ggcggtgtgcaactgtaatc ccagctactc ggaaggctga gacagaagaa 45060 tcacttgaac ccaggaggcggaggttgcag tgagccgaga tggtgccatt gcactccagc 45120 ctgagtgaca gagcaagactccatctcaaa taagaaagaa agaaagaaag agagagagag 45180 agagaaagaa agagaaagaaagaaagaaag aaagaaagaa agaaagaaag aaagaaagaa 45240 agaaagaaag aaagraagraagaaagaaag aaagaaagaa agaaagaaag agagaaaaga 45300 agaaagagaa agaaagaaaagaaaaagaga aagaaagagt tgagaaagaa aataattttt 45360 tattccattt ctgtcccctactctactcca cagattgaac ggtttttcag gaagatatat 45420 caatttctat 
ttcctttccatgaagataat gaaaaagatc tccccatcag tgaaaagctc 45480 attgaggaag atgcacaattgacccaaatg gaggatgtgt ymagccagtt gactgtggat 45540 gtgaattctc tctttaacaggagttttaac gtcttcagac agatgcagca agagtttgac 45600 cagacttttc aatcacatttcatatcagat acagacctaa ctgagcctta cttttttcca 45660 gctttctcta aagagccgatgacaaaagca gatcttgagc aatgttggga cattcccaac 45720 ttcttccagc tgttttgtaatttcagtgtc tctatttatg aaagtgtcag tgaaacaatt 45780 actaagatgc tgaaggcaatagaagattta ccaaaacaag acaaaggcaa gtattaaaag 45840 attactttta cttagaggtttacactaaag tcaagttttg tttagcttca gaaatggtag 45900 acatttctga gtcacattgtatagcgtttc ttgaagagac aatttatgga aaatgtttca 45960 gagcctctta aaagaagctttgaagtctgc taaacactat ccctcttcca tcatcgttga 46020 gaactgaact ctttctagagcaaattttca aagcagaaag aaaaaatgct aataggttga 46080 gaacttgaaa aaaaaaaaccagttccctca tttattattt ctttatttat tttattttgt 46140 gacggagtct cactctgccacccagcctgg agtacagtgg tgtgatcttg gctcactgca 46200 acctctgcct cccaggttcaagcaattctc ctgcctcagc ctcccaagta gctgggacta 46260 cagttgtgca ccaccacgcccagctaattt ttttgtattt ttagtagaga cgggggtgtc 46320 agtatcttgg ccaagctggtctcaaactcc cgacctcagg tgatccaccc gccttggcct 46380 cccaaagtgc tgggattgcaggcgtgagcc accatgcatg gccatttccc tcatttatta 46440 aagctcatgt agatgctcagctctattctg ctaaagcatc agagagcttc tttaaaattg 46500 atctggaatc ctcaactcccagtttgagaa gcccactctc acatataacc agagcaattt 46560 agtgccctcc tctgaatcactacaatcatt ccttaaatca taaaatgtat gcataaaacc 46620 acaaaaaatg ctcataaaccccaaactaca gaaatattag ataagaattg ccttctacca 46680 acactaatca tgcctcatggcatccatgtt ggagacacaa tgctgcttta tgttttaagg 46740 cggcagatat cttctgtgggcttctatgga gtaagttaga taccgcattc gagaatgaga 46800 attgccacga gggtcaagtgtaggatctgc atttcctttg tcactgtatt gacccctaag 46860 ccaggttgaa ggctgctcccctctgagatg aaaaataaaa tgggctcctt ctatctattt 46920 ttctttttct tttttctttttttttttttt tgagatggag tgttgctctg ttgtccaggc 46980 tgtattggtg tgatctcggctcactgcaac ctctgcctcc tgggttcaat caattctcct 47040 gcctcagctt cccgagtagtggggattaca ggtgcccgcc accacgcctg gctaattttt 47100 gtatttttag 
tagagacaggggtttcacca tgttggccag gctggtctcg aactcctgac 47160 ctcaagtgat ctgctcaccttggcttccca aagtgctgga attacaagca tgagccacca 47220 cacccagcca gccaccacacccagccagcc accactcctg accctatctg actatttttc 47280 aattatatta gctgtagctggcaacatctg aatcagattc tcaaaatcgc catgacatta 47340 cataactggc ctctacataggagaggttta cctttcagaa actgaagcta ggaaacagtg 47400 cattacatcc ttcaggtgccatcgttccat gaacagagaa cagccatcat tactggaatt 47460 gttgggttct atttcagagtccagtggact ttttttataa gtcaattatt tggtctggta 47520 gtccattctg aggttgcaaattcatcaaat attcaggata aacaccaggc gagtagacta 47580 aatctatcca ggctgggtggtattaagtga ttttagcctg actgtttaca tggatatcaa 47640 ctgtcttgga ataacactgagaatatgttc attagaacaa aagggctcct cccctccatg 47700 ttgtgtagca gccttacacaagcattggtt acattcccat gtgcacagga ctgtcagtag 47760 tgattcagac atgccacaatctagataatt tttcaaccac tgtaaccccc tcccacacac 47820 cagctacgaa cataggtttccactgtctgc caccattgcc ttctcattca cacagctggg 47880 ggccagccct actctcagctgcctcacacg caccctcccc agcccctctg cgccacttcc 47940 atctcagtga tgacctggaaagccaaggtc ccctgtgaat gcaaatagta aagacaaaaa 48000 caaaatagca accaaaaaagtctgtgttac actattgtac tcttctttct ccagtatccc 48060 tcccctagcc agacagtacacagaagctac cgcagaggag acactgtctt cccagatgag 48120 caaatgtgga ctgtttatcaagaatagtca ggcaggcgct ctacagcact tgaatgtggt 48180 ttccatcact tttctggacaggtagttggt gaggaataag cctactgccc ctagaaaatc 48240 tgcctaatga cttgacactttgagttttgc cccttgtggt aggcaaaata atgactgccc 48300 acaatatccc caccctaatccccagaacct gtgaatttat gttatgcggc aaagggaaag 48360 taaggatgca gatggaaatcaatttgttaa tcaactgact ttatttttat ttatttattt 48420 tttgagacag agtcttgctctgtcacccag gctggagtgc agtggaatga tcttggctca 48480 ctgcaacctc cacctccggggtttaagtga ttctcctgcc tcagcctccc gagtagctgt 48540 gattacaggc actcaccaccatacctggct aatttttgta ttttgagtag aggcagggtt 48600 tcgccatgtt gtccaggctggtctcgaact tctgacctca aatgatccgc ccacctcggc 48660 ctcccaaagt gctgggattacaggcatgag ccatcatgcc cggcctcaac tgattttaaa 48720 atagagagag tatgctggattatccagatg gattcaatgt aatcacaggg tccttaaaag 48780 tggaagaagg 
aggcagaaaagaattaatag tagcagccac aagagaagga cttggctcga 48840 ctttgacgac cttgaagacagaggaagggg ccaggagccg agtaatgtag gtggcctcga 48900 ggaactggaa atggtatagaaatgaattct cctctagagc ctccgcaaaa aactagccct 48960 actgacatct tttttttttttttttttttt tttttttttt gagacagagt ctcgctctgt 49020 cttcaggctg gagtgcagtggtgcgatctt ggctcagtac aacctccgcc tcctaggttc 49080 aagcgattct tctgcctcagccacctgagt agctgggact acaggcacgt gccaccacgc 49140 ccagctaatt tttgcatttttttttttttg agacagatga catcttgatt ttagcctagg 49200 gagacccact tcagacttctgacctaaaag accaaacaat aatgaatttg tgctgtttca 49260 agccactgaa tctgtggtagctgtagcaga gctaataata atagtaactg accaacattt 49320 actgagcaag ttccgtgtggcaaccttcat ggatgggcct tattggtcat gattgtttaa 49380 agggccaaaa ttagaaaaatagctaacact gaattatgaa caccaggaaa aggagagcgg 49440 aaataaaaag aatcagaaatatcttgataa ttaatgctat ttttgttgag tataggttca 49500 ttttgttctc atatttctttcctaccttgg tctttctgga cctcagttcc tgaatctgtt 49560 gaaagcgaat aggtccaggaaagtagctct tggaattatc ttcatttgcc ttatgaatcc 49620 ctggaaggaa cagatgagattgagttctac tgtagcttga cccgtgcggg ggccgggaga 49680 cctggttcta atgctgccttagagagtgtt agttaacatt aattttcgcg tgggagaaac 49740 agacaggcag gtgggagagtagatgattta gctcagtgac tgcactggaa gtagctccct 49800 ggaagggttc tgaggttctgtcaaggctag actaagcgag gtgatggatt gtgctgtggc 49860 tgcaggatgg ggaattagtgtcatatgggc ctagaatttg tcatccttgg tgtacatacc 49920 aggtattaat ctagatgctagagataaaat gatgattatg acacagcctc tgacttccag 49980 gagctcagtc cagagaaaggaaaacagatt agtgaacaat tacatcacca tattgtgggt 50040 aaaatggcag aagaaggtatggaagaatga caagattaaa atggcaagac caagtccctt 50100 ccctcaagag gcttacagtctaatggaaaa gataagaaag caaacactac ataaagcagg 50160 aattaattct acactggaaattctcacagg gggctataca gggcaaagaa gagggtccag 50220 gaaagcagct gggagaaactgactttctgg tcaccaaagg ggatgggtgc cttacatgcc 50280 attctatcaa acagtgcttcactgttttta aactatggac tttgcaattt atctcaaaat 50340 aaaacgtttc atttttaaatgctgaggatt taatatgaca gaaaatcatc aggttgtaaa 50400 ttagtaatac atgtttcctaatgtcaaaca ctctattggg aaccgccaat tttctgttgg 50460 atagacttct 
cttttacacatttttatatg gattgttaat tctcctaggg gaaaaaactt 50520 ctcaaaactt gattggctttagatattttc ctaaatcttt gaccccctgt tcataacagt 50580 atatgcatct ccacacacacatactcgcac acatatgtgt gtatatatat gtgtgtgtgt 50640 gtgtgtgtgt gtatatacatatatatgaga aatgcaaaaa aagaatagta ataaaataac 50700 cacctatcac ccactttaagaaacagacat ttctaatatc tttgaaactt cttcccaatt 50760 atagctttaa aaattaattattaaagagtt ttttaaaata cagaaaagtc caagagaaaa 50820 agtggttcac aatcacctatttacttaatc ctattgacat cagaaatact aatgatataa 50880 gacaaatgat ttttaaagtaatcaaatata taaaagaaca aaataaatga aagctgccct 50940 ctcctacctt atcaactccctcttctaaaa gatagttatt aataattctt catgactcct 51000 cctagaaaat aaaattacatgcattaatat atgtgtgtat atactactaa taaatttcta 51060 gtaatgagat tcttggattcaagagtgtgc aatttttaat agctgttcag ttgtcccagg 51120 aaattattgc accaacgtgcatttctgtgt ctaaatatag gaaaaagggc caggggcggt 51180 ggctcatgcc tgtaatcccagcactttggg aggccgaggc gggtggatca tttgaggtca 51240 ggagttcaag aaaccggcctggacaacatg gcgcaacccc atctctacta aaagtacaaa 51300 gattagctgg gcttggtggctctcacctgt aatcccagct acttgggagc ctgaggcagg 51360 agaatcactt gaacccgggaggcagaggtt gcagtgagcc aagatcccgc cactgcactt 51420 tagcctgggc aacaagcaagactctgtctc aaaaataaat aaattaaata catacataca 51480 tataggaaaa agattttgaaagcactggta agaaaaagct gcggcattgt ctccacttct 51540 tcaaagtgca aactcttatgacactaacgt gtaaatgtta tgttccctgt agctcctgac 51600 cacggaggcc tgatttcaaagatgttacct gggcaggaca gaggactgtg tggggaactt 51660 gaccagaatt tgtcaagatgtttcaaattt catgaaaaat gccaaaaatg tcaggctcac 51720 ctatctgaag gtaaataattgctattttgt tttttattct actttaagtt ctcaggtaca 51780 ttttgttata aagtttcggtgccacaaaag aaatagcact cgaatataaa attttctttt 51840 taattctcag caaggaaagttacttctata gaagggtgcg cccttacaga tggagcaatg 51900 gtgagcgtgc acttgccaagggaggggaag gggttcttaa ccctgacaat gcacgtggcc 51960 cctgctgctg tgtggttcccctattggcta gggttagacc gcacaggcta gactaattcc 52020 cattggctaa tttaaagagagtgacgaggt gagtggtctg gagggaaaaa tggttatgac 52080 agagcatgta atcggaatgaatcagggcgg agcgtgtaat cggaatgaat cagggcggag 52140 catgtaatcg 
gaatgaatcagggtggagcg tgtaatcgaa aaaggttgct ttacgaggaa 52200 attaagttta aaagtagaaggcaaagaatt gaacatactg acatactgat tctttggaaa 52260 gaaatttaga actcacatctaacaattttt tagggtttct ttagtattct ggacagagga 52320 caaaatctca ttctcacaagcatagtggat tcatttgctt tcctccaagc acttttttgc 52380 aggctcattt ccatctgggggcgttcaatg taggtttata aactggtgtt ttgtttgttt 52440 gttttatgag acagagtcttgctctgttgc ccaggctggn gtggcacaat ctcggctcac 52500 tgcaacctcc acctctcgggttcaagcaat tctcctgcct cagcctgcca agtagctggg 52560 attacaggca tgtgccaccacgcccggcta attttttttg tatttttagt agggacgagg 52620 gtttcaccat attggccaggctggtctcga actcctgacc ttgtgatccg cccacctcgg 52680 cctcccaaag tgctgggattacaggcatga accaccgtgc ctggcctggt ttataaactt 52740 ttattattcc aaagtatgtcattctttcac tttctttaat tccctaattg ttcttgtgat 52800 tttttttatg attaatgaccaaacactatt gtgtgcaaaa gaaaaacctt gagcaaatta 52860 gcgcaactcc ttccttcttaccgcaagcaa aaagaacccc tgcccccaac catgaaagaa 52920 acctttcatt ctgtaaatcagtgtttagac aagtgaaata tttttttgaa agtggcattg 52980 gctctttccc attggtgggttaatgaacta attagcattt aaatagggaa agtggcttct 53040 cctcccaagc cccaggaatccttttccctc cctttctagt tccttcccca ggaaggaaat 53100 cattctccct ttcctccatccctcccctca ttccctttcc cttctccaga ctaaagtcac 53160 tcctccaacc ccaccagggccaaattacaa cttttcttac ataaaacaag agcttttgat 53220 tcctatgctt ctgcattttatctcactaaa gccctaaggg aaggaaattt tcaaagtgtg 53280 actaatggct tacagtaggaaattggaaga tacagaaggg acagaaatca acatgtcagt 53340 aaattctaca acactagctagagatttggg gcaagtcatt tatgctgtct aggcctcagt 53400 tgagtaattt gtaaataaaggacccaagat aatctttggg ttctaacaaa attcttctgt 53460 aaaacagtgg tccccagccttctggcacca gggactagat tcctggaaga caatttttcc 53520 aaagatggtg gggcagggggcacgtttggg gatgatcatc aggcattatt ctcctaagga 53580 gcgctcaacc tagaccctttgcatgcacag ttcacaatag ggtttgtgct cccgtgagaa 53640 tggaatgcct ccgctgatctgacagcaggc ggggctcagg cagtcatgct tgctcacctg 53700 ccgctcacct cctgctgtacagctccgttc ctaagaggct acaggctgat atgggtccgt 53760 ggcccagggg ttggggacccctgctataaa ggaagttcag aaaaatcaga ttataattct 53820 gatttttata 
aatcagaatttataaaattc agattataat ttactaccaa gtaatagctc 53880 ttttgccctt aacttcccacagtgaagacc actggagtaa tttatatcaa cgcaaagaac 53940 aaaaagcatg gtcagtggaaactcctgccc ctcccttggc tttctctcct caatctaaca 54000 gtgagcaagt tgcaacaaatcgcgccgttc agagaaaagg gaggatggaa ttgttacaac 54060 cgtttctgtc gcccaggctggagtgcagtg gcgcgatctt cgctcactga aacctctacc 54120 tcctgagttc aagcgattctgctgcctcag cctcctgagt agctgggatt acaggcacgc 54180 gccaccatac ctggctgatttttgtatttt tagtagagat ggggtttcac catattggcc 54240 aggctggtct cgaactcctgacctcgkgat cctcccacct cagcctccca aagcgctggg 54300 attacaggtg tgagccatcgcgcctggcca acaaattgtt acaatgttaa acaacataat 54360 atcctaaaca tattggcttttaaagtatca ttagatacac cacaatacta ataaaggtta 54420 cctttgggtt ttaagattaaagatgatttt taaaaatact tctttctgta ttttccaaac 54480 tcttaaccat aaacataagatattccttga cttaggatag gattatgtca caacccatca 54540 taagtttgaa aaatcataagttgaaccatt gtaaattggg gaccatatgt acatgtatgc 54600 atatatgata ttaaaaattattagacgtct ttaaaatttg actttttaac atattacttt 54660 tatttaatca ccttgctcaaggagcctgta aattacatat taatattctc cattatgaaa 54720 taagtctttc cattgtgcaaattaatgcat tgcagaggtt ctaaacatct atatgctttg 54780 caactcgaaa ggagtaagtttccctttcta atttttttat tcaattaaat aaaaaaatga 54840 gtttaataga gtctattaaattagatcatt attcggagtg gttagtaaac ctgtttagag 54900 tcgacaacac tccctttctctctttttttt tttttttttt tttgtgccag agtctcgctc 54960 tgtcgccgag gctggagtgcaatggcacga tctcggctca ctgcaacctc cacttcccag 55020 gttcaagtga ttctcctgcctcagcctctc gagtagctgg gattacaggc aaccgccacc 55080 atgcccagct actttgttgtatttttagta gagatggggt ttcaccatgt tggttaggct 55140 ggtggcgaac tcctgacctcaagtgatttg cctgcctctg cctcccaaag tgctgggatt 55200 acaggcgtga gccaccatgcccagcccctt tctccttttt aaatatcacc agcctgggtt 55260 ctttgttctt tttgttttgtttygtttttg tttttgtttt ttttgagacg gagtcttgct 55320 ccgtngccca ggctggagggcagtggcaca atcttggctc agtgcaacct ccgccttctg 55380 ggttcatgcc attctcctgcctcagcctcc tgagtagctg ggactacagg cgcccgccac 55440 catgcccggc taaatttttgtatttttagt agagacgggg tttcaccgtg ttagccagga 55500 tggtctcgat 
ctcctgaccttgtgatccac ctgcctcggc ctcccaaagt cctgggatta 55560 caggcttgag ccaccatccctggcctccag cctgggttct tattgacact gaattctcaa 55620 gttagttggg ctagtgaggaagtcaggtta cacgggccac agaacaagaa caaggattgt 55680 tctttctctc tctcttccacttcattctct gtcagcctct cccgacctca gtagttggtc 55740 ttttctcccc cttcttttgaaagcagagtc cattatacaa atggacttgt ttacttctcc 55800 acatccctct tgtgcaaattttctgccatg gacacctcta ccccacctta gaatgtatat 55860 tagacaattt tgacatctagaatgtcttgt tgggcagaaa agcgtttgga aagcgttgct 55920 ccaggtagct ctgattacaaactggacctt ttcgcggggt tacctagagc agttgagagt 55980 gctctttctc ctggccaggtgcagttgctc atggctgtaa tcccagcact ctggaaggcc 56040 gaggcgggcg gatcacctgcggtcaggagt ttgagaccag cctggccaac atggcgaaac 56100 cccgttctac taaaaatacaaaaattagcc agatatggtg gtatgaacct gtaatcccag 56160 ctactcagga ggctgaggcaagagaattgc ttgaacctgg gaggcagagg ttgcagtgag 56220 ctgagatcaa gcctccagcctgggcctcag agcgagactc tgtcttgaaa aaataataat 56280 aataataaac agataaataaaatttaaaaa aataaaaaag gagtgctctc tctcctgaac 56340 tgctgactcg aggactctctcagcctgttt tatcatttgg aagaggaaat aatatatctg 56400 cttcgtacac atctttagaagtttaaataa aatgtctgaa atatcaatga ttctcattat 56460 tcaaatattt gttttttaagtcacagttgc aaggttatat acagaagcat aggtttttat 56520 aacagaaaaa tagacacttaatatactgac ctcttacaaa aatagtcctg ctcaagcatc 56580 ccatctatgt atcattamcatctatttctt tctacccagc taaaatagtt tattaataat 56640 ccttgaatgt cacaagtngaatacagaata aatcagataa tacattaaaa tgcacctgat 56700 aatcaatatg caccagataatggacacagt atacatcaga taatacagta caaattcaat 56760 gaaagtttag tgttgcaaaggtaaaatgta aagaatgtcc taatgtgctc ccatgctgct 56820 taaaactgtt attataaattgctttttatt ataaatatat aaagaatgat gtaataggcc 56880 agccatggtg gctcatccctgtaattccag gtctttggga ggctgaggca ggtgaatcac 56940 ttgaggttag gagtttgagaccagcctggc caacatggtg aaaccccgtc tctactaaaa 57000 atataaaaat tagccaggtgtggtggtacg cacctgtagt ctcagctact ccggaggctg 57060 aggcaggaga atcgcttgaaaccagaagcc ggaggttgca gtgggtcaag atcaagcaac 57120 tgcactccag cctaggtgacagagcgagac tttgtctcag gaaaaaaaaa aaattctcag 57180 tcacctagat 
tgagaaatagaacattacca aaacagataa agccccactg tgttcccatc 57240 cacatcacat tcactttatctcctcaaaag gaaagtgcta ttttgaattt agtattaatt 57300 atttccttgc atttcttcctactcatatca tgtgcctata tacatataat atatacaaat 57360 gccgatatca tacatagcaatgttttacat ttcgattttt gcattgtcaa tgtagaattt 57420 ttaaacttaa aaacatgcttcatacagccg ggtgtggtgg ctcatgcctg taatcccagc 57480 attttgggag gccaaggcaggcggatcgac gaggtcagga gttcgagacc agcctgacca 57540 acatggtgaa accccatctctattaaaaat acaaaaaaaa atattagctg gtcatggtgg 57600 cgcgtgcctg taatcccagctactcaggag gctgaggcag gagaattgtt tgaacccagg 57660 aggcagaggt tgcagtgagccgagatcgca ccattgcact ccagcctggg tgacagagcg 57720 agactccatc tcaaaaaaaaaaaaaaaaag cttcatacaa acatgaaacg ggcacatgtc 57780 tggctgggtg cggtggctcatgcctgtaat cccagcactt tgggaggcca aggcgggcaa 57840 tcacttaagg ccaggagttcgagaccagcc tggtcagcat ggtgaaaccc cgtctctact 57900 aaaactacaa aaattagccaggcatggtgg catgcgcctg tagtcccagc tactcgggag 57960 gctgaggcac aagtatcacttgatcccagg aagcagaggt tgcagtgagc caagattgtg 58020 tcactgcact cctgcctgggtaacagagtg atactctgtc tcaaacaaac aaacaaaaaa 58080 aacaaagaaa agaaaaagaaaaaagaaatg ggcacatgtc aaatgttaat ttgactatgt 58140 aacttattaa tgaaggaaccagcagggtgt tagagctggg tcaaagaagt ataagagaga 58200 ctggagtgct tacagtcaagcagagacaga atgctgaaag gttatgaaat tagatatgtt 58260 agttaatatt cgaaagggcaactaaactgt aaatcttgcc attatctttt ctatcagacc 58320 aaaataattt acatctctactagacaaaca tttgccactt ttcaatccat aatctatggg 58380 taatttcatg gagtctggccctaatcaaca gtaaatagta aagccaacaa aggatctctt 58440 ccctagacct tgaagtgatctttgggtgga ccccttagac aataatttag tatgacattg 58500 agaggacacg caagcctgggcagcatagtg agacccgcct ctacaaaaaa attaaaaatt 58560 agccgggcat ggtggtgtgagcctgtagtc ctagctactc aggaggctaa ggtggaaata 58620 ccacttgagc ccgggagttcgaggctgtag tgagctatga tcatgccatt gcactccagc 58680 ctgggtaaca gagcgagaacctgtcttgaa aaaaaagaaa agaaaaaaga aaaagaaaca 58740 aaaggaaatg cagccattttttttttgcct tatttccaag ttctggataa tttttctttt 58800 ttaacaatat aaatattatcacttatgtat tcttttgcaa tatggctttt cactcagtgt 58860 agtttgcaag 
gggttagccatgtgaatgca tgctgctcta gttcattaat tcactgttgt 58920 atgttggtct atgtaggcatatcacaatwt atycattccc tagctgaagt acatttgctt 58980 tcaaggtatt gctattataaacaaatctca tacctttaat caaataataa ttttgtctct 59040 tcaatcagct ntgatttactttgttcnaan acnaagcaca caactataat tanaatttca 59100 ttactgataa atataaaatattttccaaaa catcacaaat cttttntnnt ncactattta 59160 ctatacactt tnggtctnaatttaaagcgg cttcactata tgtggttctt ttcctctctt 59220 cccatactaa ttactggtactggacatata catccaaaat caaatagtar tgtccttttt 59280 aagggataaa tgggatgtgatgtagaaggg gcatagtagg gacttcatct gttttggcaa 59340 attttttctt aatataggtggtaggcatgt ggaatttata acaaaagttc tgtctccagc 59400 ccagtttctg ttacataaaaccatataatt aacagttaaa ctggatctgg tttgacacag 59460 atgtagacga tattaataattactccagaa caacaggcat aactaaaaac taccacaggc 59520 aaaaggggaa aatagagaatgtaagggctg ggacttaagc ccatgttgcc cacctccaag 59580 tttcatggac tttttccttctccacattac tttcttctct gctagactgt cctgatgtac 59640 ctgctctgca cacagaattagacgaggcga tcaggttggt caatgtatcc aatcagcagt 59700 atggccagat tctccagatgacccggaagc acttggagga caccgcctat ctggtggaga 59760 agatgagagg gcaatttggctgggtgtctg aactggcaaa ccaggcccca gaaacagaga 59820 tcatctttaa ttcaatacaggtaaaggaga gacccaagag cagatacgga aatgacacgt 59880 gcataccttg atttcactgttaatttactt atgaattgtg tctgaatttg aaaacaagct 59940 gtaggaggta ttcatatttccattgtgatt gccttcaggc tgacttgatt taacgtagtt 60000 catggtcttt agaaaacaagaaagtccata aagaaaatca atttaaaaca caaaatactt 60060 tctaatctag aaatggctatttctgcttag agttataggg ctataactga tagaggtaac 60120 cttgaagaaa tatggccaatgtaggtttta ggagagaaga cttacaaata aagcaatttg 60180 agttcaaaat ttgactctgaaacttaccag ctgagtaagc ttgggaaagt acctcaacca 60240 ttctaggcct cagtgttccacctgtaaaat ggtaacaatc atagctatct taacgtgtac 60300 acctataaag tgattagtatagatttctta tacaaaacaa gagctctgta aattatagct 60360 cttattagtt gctgacacaataaagccact gagttatctt gagaattaaa catttatatg 60420 ttactcgtca cataaaaatacattgccagc tgggcgcagt ggcttatgcc tgtaatccca 60480 gcactttggg aggctgaggtgggtggatca cttgaggtca ggagtttgag accagcctgg 60540 ctaatgtggc 
gaaaccccgtctctaccaaa aacataaaaa attagccaag tgtgatggca 60600 cacacttgta atcccagctactcaggaggc tgaggcagga gaatcacttg aacccggaag 60660 gcagaggttg cagtgagctgagatcgtgcc actgcactcc agcctgggcg acagaaggag 60720 actctgtctc aaaaaaaacaaaaataaara catattgcca tcttaaattc cacctatacc 60780 atgactccca gattcagtcaataacttttt gcataacatg caagtgactt ttcttcctaa 60840 gacatccccc ctccaacacacacacattac cttaatctac aaatgcgcca ggctagtgat 60900 tcctgatgag gctggttttgagggttccca aaaagacttg gatacaaaaa ttactgggca 60960 gagcaattga agatgcaatattctgtgtgt agtatgttag gttatgttgg tgccctatcc 61020 agatccctgg ggatcccttttaccagctcc cactggtgct ggtgctgctg ctaactgctt 61080 atctctgaaa ctttctcccaaagattgccc ttggagcact tatgccccag agcttcctgc 61140 aggatcaggc tgaggctaacagtcatctga agccatatcc ttgcttagct tctttcactt 61200 ctctagtttg ctttcctcatccccttaaaa gttgcacctg agagcattct ttataaacca 61260 cttctgtcag aatctcaggcactgcttcta ggaaattaga cttatggcat tctataatcc 61320 agcatttccc tcttttttcaaactacaaag ctgtggatca tgcctgattt gagaaataag 61380 tttagaaagt cacagcaagctcattaaaaa acaaaattaa aaaccataca aaaaatagaa 61440 taggacaaag tagaaaatattagcatgcat tgcatttcat aagtcatatg cacatcatgg 61500 aatttcattt ccattttgtatgtgtatatg tgtgtaaaca tatatacaca tatgtagaca 61560 tacgtgtgtg ttttgaatcatgatgtcaag tgtattcatt actgcagacc acagtcaaag 61620 ggttttgaaa gccactgttccaatccctgc cagctctctg attctataac tctattagat 61680 tacacttgag gaaggtaaaataattcaata tatttgatca tcctcgcata tatagacttt 61740 tagtttaacg aggaaaaagtcttgtattga agaataaaac ttgaagaaaa attttagcag 61800 tgctttcaac ctttagaaatctacagtcaa tatttagttg tttttaccat tgtcagtatt 61860 ttctattctg tgctttgatttacttccatt ctagtgtctc ttgagtaaca taacagattt 61920 atctaaaatt ctttatgctgataacaaagg cacttctata taaaaacctc cacataaaat 61980 aaaattatgg ttttcaattatacattttta taacaattat taccacttaa gagcatttac 62040 tgggtgtcag gcaatgttctaagacttttt ccatatatca gatcatttaa taccctcaat 62100 gaccctataa gggaagtagaattctttccc cagtttttca aatgaggcac agaggaggtt 62160 aagcaacttg tctgagctcacacagctagt aaatggtaga actagaattc aaactcaagc 62220 agtatttctc 
tagaatcagtgaacgtaacc actttgctaa actgcctgtg aagttacttt 62280 tctcaaaaca gctcctatttcaccatgtaa agaaaagtac aaacccataa aatagcaagt 62340 gctgaagaga agccttatgaaagaaatata caaattccag caagtgaaaa cggttgtggt 62400 ccctggttgt ataatagttacatgggtgtt gactttacaa ttatttaaac caaacataaa 62460 tactttatgc agtttttatgtatgttatac tcacagaaag agaagggaaa aatttttaaa 62520 tcattctctt aaggttacatcaagttgcgt atcagttcag ttccatttaa atgattcaaa 62580 tcaaagtctg tgcatttgagaattcattaa gagagtaaca tacatgttat tcattaagag 62640 taacataaat tttgcattgattcttgccaa aatcacacct acaaccataa attgtaaatt 62700 tctaggaaaa ctcagtacaaaacttggtgc aatgcaataa agtttgtggc acagacagta 62760 atactcagca aacatcccacctcctctctc atattttcca gctccccttg tggttaaacg 62820 ttgccatgtg gcaagttctggccagtgaag cgtgagcaaa actgaaaagg gttctttgta 62880 gattgagaca gtgaagagcctatgtgtgct catctattct ctttttctgc tgagggcaca 62940 aagaaagtcc tgaaatcatgtgctacagct atgagataat gtgcctttgc ctaccaggct 63000 tctcagtgtt tactggtgtggagcccttgt aatggacaca taacatgaac aagaaataaa 63060 tctttgttgc atgaagccctaggaatgcca ggactaatct gttacctcag cacaaaccca 63120 ggcctatcct gactaaggtggtattaaatt actattgaat gtgtattggg atttagtaaa 63180 cttctactgt ataatccttcttctgtaggt agttccaagg attcatgaag gaaatatttc 63240 caaacaagat gaaacaatgatgacagactt aagcattctg ccttcctcta atttcacact 63300 caagatccct cttgaagaaagtgctgagag ttctaacttc attggctacg tagtggcaaa 63360 agctctacag cattttaaggaacattttaa aacctggtaa gcagagtgcc tggttaggaa 63420 tgccttgttg acaggaatagttaattctca aaagggaaaa acaaaacttg tttcaaaata 63480 cctggaaaac atgtttaacctcattaataa agacatgaaa acaaacaaga tggcattttc 63540 tgcctatcag atttgcaaattaaaaaaaaa cccaggaaat cctgatagga atgtgatgaa 63600 atgggaattc tcatatatcatgtattggtg ggaacataat tggttttgca tttttgaaag 63660 ctatttgatt atgcatatgaagagccataa aatttccttt tgatataata attccacttc 63720 cgaaatcaat cctaaggrataaatctaaat ttgatgaama ktctccctcc aagatctaga 63780 tttgcagcat tatttaaatattaaaagttg gccgggcgca gtggctcatg cctgtaatcc 63840 cagcactttg ggaggctgaggcgggcggat cacgaggtca ggagattgag accatcctgg 63900 ataacacgga 
gaaactgcgtctctactaaa ataaaaaaaa ttagccgggc atggtggcgg 63960 gcgcctgtag tcccagctactcgggaggct gaggcaggag aatggcgtga acccgggagg 64020 cagarcttgc agtgagcagagatcgcgcca ctgcactcca gcctgggcga cagagcaaga 64080 ctctgtttaa aaaaaaaaaaaaaaaaaaaa atatatatat atatatatat atatatatat 64140 atatatatat atgttaaacatactcttaat gtgtaaaaac aagagaatga ttaagtakat 64200 tatgactaaa tacactcaatacattttatg aaacgttaaa aatattcaaa aaatttaaat 64260 aatgacttgc taactactttaacaagagct ttattatcag ctagtcttgg aggtaatagt 64320 attatcatga tttttcagaaaaagatcctg aggctcagtg tccaaggtcc aatgaactac 64380 tcaggtcgga ggtggtagagcagcatgtgg agccagttct ctctccgact ccatcatcac 64440 actgcacggc ttcctgttaagatatttgct caaaaaatgc gagatataaa aatctgggta 64500 atatgatcaa ccttaaagaataattacatt ttaaattatt catgagacct tgttagtagg 64560 tcaccatcaa tgtgtaattaagccagatgt gacaggattt gttgcctctc cctttacttc 64620 tgaattttgg aggcctttttttttttctag ttgtatcagt cagccaacca atatcttttt 64680 agcatctact aagtttagatacgggaactg gtactctgaa agagaaaatg agaaatttga 64740 caagatcctg tccccaaggagcttcctatc caacaggggc acaagacaga tagatagaca 64800 cacacacaca cacacacacacacacacaca cacacacaca ctataaagca aggcaagatt 64860 tagagagtgc acaggagtgggctctgggag ttcaggggag ggtcgttcac attctggtag 64920 ggaagatact tctgagctcagtatattccc tttctcactg tccttctatc ccctctcttc 64980 ctctcctcct ctcttttcctctttcttctc cctcctccca ctctgtcctc tccctttctt 65040 tccttttttc tttctttctttttttttttg agacagagtc ttgttctgtc acccatactg 65100 gagtacagtg gcacgatctcggctcactgc aacctcggcc tcccaggttc aaatgattct 65160 tgtgcctcag cctcctgagtagctgggatt acaggcgcac accaccatgc ctggctaatt 65220 tttgtgtttt ttagtagagacagggtttca ccatgttggc caggctggtc ttgaactcct 65280 gacctcaagt aatccacccaccttggcctc ccaaagtgct gggatcacag gcatgagcca 65340 ccacactggc ctcctctccctttcttaaaa atacatcaat taattaaata tataaatgta 65400 gatacacaca caggcagaatcaaagtgtat aggttggaga ggagactgtt ccaaaagggg 65460 ggatggcatg ggcaaatacggcaagaaaga gtagagcatc taggtactga gggtgctggg 65520 aagtcctgct aaaaatacggcaagaaagag tagagcatct aggtactgag ggtgctggga 65580 agtcctgcta 
aagtggtcccctcccactgt ggggcctttg agtttccctg tgccagggta 65640 cctgccctct gtgagtttgagttctttctt tggttgcaag caaccaagac cagctcagct 65700 aaaagaaatg gatggataccgactcatgag tcagagggga agctggacgt ctatgcccag 65760 agccaggcag aaacgggtcaggtctagagt ctgggaggag gaaaccgatg gacagctgct 65820 tcagggccca gcgctcagggtgaagcagct gcagttgttt ttagtcctca gatcactctg 65880 ctcaagatgt gacttgccaggaggaatctg gctggcccag ctgggacatg tgtgtctacc 65940 tctagaccag gagagaggagagtcttggtt gacagtcccc atgtagtacc cctttgttta 66000 ggttactgag tcatcaacagatctcagttc aaatagtcac ttcttcaggg gcaatatacc 66060 ctcttctacc cataaactaggggcaacata ccctctctcc cctttcacac atgaccataa 66120 caccatgtag cactcaactcttgtaagttg acatttaccc atgtgactct ttatgaacgt 66180 tcatctccat cccgagacctacagtccatg agggtaccac cgttctaggg tttttgctct 66240 tctctttgtc agtggggacttaggactctg cctggcacag ggcaaaccct caatatttgt 66300 tgaataaatt aattaataaacacgtgtaaa tgaatatcag tagactacaa caagagtaac 66360 agtaggcgaa ggtggaaggcaaaggtggga agaggtcagg gctctgagtg ctgggctgtg 66420 gagtctgagg ttcactctacagcgctggtg agacacgata ggttttagag aaaggaagcc 66480 tcatgctggt gccccagtgggtactgacta tgcatttgta gccaaatcaa agtatttccc 66540 ataaagtcat ctatctcttcccagttgttg ggacttccaa tggcaatggg aattaagata 66600 ctgagtaatt gggagatcaagcaaattatt tactaacaag gcacacgaag tgatttttca 66660 caggcaatgt taatgtttttcttttttatg tagttttaaa attctaaaag taacaaaatc 66720 acaactacca aacatttagacgacaaaaat tatccataat cccaccatct taacacaacc 66780 actattatca tttgttttccttattcacat tttctaccta ttttcttaga ttyccaagaa 66840 atagaattac ttgtttagaggttattaaca tcttattgtt ctggatatat atatatatat 66900 agctatatat agctaaatttaataacagca atgtctgcag taccactttc tcaaatgcta 66960 actggcattt caattttttgagacagtctc tctctgttgc ccaggcagga ttgcagtggc 67020 atgatctcgg ctcacggcaacctccacctc ccaggttcaa gcgactctca tgcctcagcc 67080 tcccaagtag ctgggattacaggtgtgcac caccacactt ggctaatttt tgtatttttt 67140 agtagagatg tgtttttaccatgttggccg ggctggtctc aaactcctgg cctcaagtga 67200 tccttccacc tcagcctccccaagggctgg tattacaggc atgagccact gcctggcctg 67260 gcatttcaat 
ttttaaaatcttcagtaata aatgaaaatt tttatcttat tgttataatt 67320 tttatggttt tttattattcatgagaataa acattttcca agtttgttta ctgactgaat 67380 ttcttttttg tgcaccttacttggtatcat ggataaaatt ttgtcaattt tctgattata 67440 tcaatgcatt cagggtcccaaacctgccaa agtttaaaga gaaagatact aagggaaaaa 67500 ccaggaaaag atggtagaaaagaatcaccc tggcattttc aatcacgtaa acatttgcta 67560 ggtgccctag ctgcaggtatacagctcact gaaacatgaa ttccaatttt atagggtgaa 67620 atatatattt agaaccctcttctggaactt tcttctagtt atctagcatc ctaagtgcct 67680 ggacgttcct gattggtttgcaatgtgttt tatttcccat ccccaagttt catagctgcc 67740 ggccctggga tctacagtcacaggctgtaa cacaatatct tgcacatcct gagtctttaa 67800 taagcttttg tagatgggctcttaccatca tcatcatcgt gaaaggcaaa tatacaaaat 67860 ttgttgacta atgtaatgagtcatgagtaa cagaagttta ctgaccaaac actacgtgca 67920 tgtagagttc agaataaacactttattatc acatcagagg aaaagaccat cttagaggct 67980 caacaaccca ggaaagctgtgacgatttct tcaaattgtt aagaatatcc atgcatatgg 68040 gtttcacatt attttgctacacacagtacc aatttttcca aaagccaaca gcaggtattc 68100 tattacccat cctggacttttactccaaga aaaaatacac tgagtctgtg agtaatttat 68160 tagtattttg atcattgctgcttttttttt ttttttaagg taagaagatc taatgcatcc 68220 tatatccagt aagtagaattatctcttcat ctgggacctg gaaatcctga aataaaaaag 68280 gataatgcaa taaacacagttgcaggaaag tatgttagct atatactatg aagtactctt 68340 agtttactta tgttgaatggcttagctatt aatactcaaa ttgagttaaa atgaaaattc 68400 ctccttaaaa aatcaaacgtaatatgtatt acatttcatg gtacattagt agttctttgt 68460 atattgaata aatactaaatcacctaggtg tctatgttct atcacatcta caaacatgtc 68520 acttcctaat taacaaaatgttcttccttt agtttgcttt tgcacttaaa atatatataa 68580 ttgacttttt tggaaaaaaatctaagattc attgctttgt tttgtaaaga ccaataggtt 68640 ctgtatagtc tttttttaaattgtggtaaa atacacatgg cattaattta ccattttaac 68700 cattttaaag tgcacaatttgtggcattaa gtacactcac gttgctgtgc aaccatcacc 68760 accgtccatc ttcagaacctttttatcttc ctaaactgaa actctgtact cgttaagcac 68820 tcacttcccg tttccccatcccccagcccg tagcaaccac gactgtactt tctatgaatt 68880 tgactactct aggtactgcatgtaggtgga atcatacagt atttgtcttt tgcttcattt 68940 tgttttgttt 
tttgttttctaagacagggt ctcactctgt cgcccaggct ggagtgcagt 69000 ggtgcaatca cagtgtccttttgtgactgg tttatttcac ttagtgccat gttttcaagg 69060 ttcatccatg ttgttgcatgtctcagaact tcctttttta ggctaatatt cttgcatgta 69120 tttacctagt tttgcttatccattcagcca ttgatggaca cttgggttgc ttccatcttt 69180 tggctattgt gaataatgctgttttgaacg tgggtgtgct acatagttac tttttaaaat 69240 tggcacaaca gcgctgtcttttgacatacg tattttatgg aaaacacaag attttcctgg 69300 ctgacgctca acctcataatttggaccttg gtgcaacaca ataataggag agctatgtgt 69360 cagtatatat cactaaggattacaatgaga gtgtatacag tcagtattac aaattataaa 69420 aagaaatgta ggccaggcacggtgcctcac acctgtaatc ccagcacttt gggaggccaa 69480 cgtgggtgga ttacctgaggacaggagttc gaaaccagcc tggccaacat ggtgaaaacc 69540 tgtatctact aaaaatacaaaaattggcca ggtgtggtgg cgcatgcctg taatcccagc 69600 tactcaggag gctgagatgggagaattgct tgaacctggg aggcagaggt tgcactgagc 69660 caagattgtg ccactgtactccagcctggg caacagagcg agactctttt ttaataaata 69720 aataaataaa taaatatataaaagaaacgt aatgaaagag agagaactct gaacttttaa 69780 agaacttttc acccagtcttgatctatctg acagaaaggc ttgtcagaga aagttagagt 69840 tcagaggcag ccaattgaatataattaact ccaaatgaag ataaaccttt tctaaatcat 69900 actgaaggct ataaaaaatgagaattatgt tatttttttt ttgagacagg gtcttactct 69960 attgcccagg ctggagtgcagtggcatgat ctgggctcac tgaagcctga cctccttggc 70020 tcaggtgatc ctcccacctcagcctcctga gtagctggga ctacaggtac taccatgccc 70080 gtctattttt gtatttttttagtagagatg gggtttctcc atgttgtcca ggctggtctc 70140 aaactcccag gctcaagcaatctgcccgcc tcagcctcca aaagtgctgt aattacaggc 70200 atgagccact gctcctggcagggaactaat agaatcctgg gttcttcggt gtgcaataaa 70260 yctcaaatac agctattcaaccatagattt taaatatttg ttagtgaagg tgacaaaaaa 70320 ataagtgatt aagagaacctattttctatc caatgagcta tcaaaagctt atagagtgga 70380 aagagagtgg gggaagtgaggctcaaaaca gctaaatgga aagaagattt tgcatgcagg 70440 ctgaactgga ttttcatcctggctactata ttctccagat gtgtcacttt ggccaagatc 70500 cttaatctca gtgtcatctataaggtaatt aaagtacact agtgccccac taatctgtgg 70560 ttttgctttc caagctttcagttacccgag atcaactgcg gttttaaaat attatgtgga 70620 aaattccaga 
aatacatagtaagttttcaa ttgcatgcca ttaaatctca tgctgtccct 70680 gaccccttcc tctccggaggtgaatgctcc ctttgtccag tggctccacg atgactacat 70740 tccccaaatt gttctcttaggaaccctttc tgtgttcaag gaacccttac tttacttaat 70800 tatggcccca aagcacaagatagggatgcc ggcatactgt tataattgtt ctattttatt 70860 attagttatt gttgttcatctctacctgtg actaatttat gaattcaact ttatcatagg 70920 tatgtaggta taggaaaaaaacatggtatg tataaggttc agtactatct gcagtttcag 70980 acatcccctt ggggtcttggaacatatccc ccgtggataa sgggaaacta ctgtaaaagt 71040 ttgtstttta tagagtagttstsagaacta cattaatcca taatgtgtgs ctcatgatac 71100 tcattgatag atggtagtagcaacaataaa aaataatatt atcaagtaac tgattcataa 71160 ttgactctca aaaacgttaattttctgctt tcctttacct aagtttacct acatgtttga 71220 atttgtaaag ggaaggtttttctagaccaa taattttcaa atatttttgc tctcatactt 71280 cctcaaagga aactgaaaaagttgcaacat acttgcatgt catttttcta tataagttga 71340 aagaatagca aattgttattttcccacgca tcgtaaagat tagcaggtca tccctcttta 71400 aaatgtacca aatggaatctaaatatcatc gcaatttgac ccagcatcat ccatttaaac 71460 aaatatacaa gtttttctttaacaatgaga aattttatct cattacattt tctccctaaa 71520 ctcttatttc aatctacattcctaagaatt ttatcctaat gtagtatatt tttatgctta 71580 aatatctttt gttgatcaacacaattttga tcatttttaa attttaaaaa ttaagaacat 71640 cctgtgacat caaattctaggtatgaaata tttattctag attgggtgat cattataatt 71700 attttttgta cataattgatcaaaataaca taaatatact acaaatttct atgactacta 71760 aacatataaa agtaaaattttaaacaaata tatctcttaa tgagaaggaa gagcttttta 71820 tactccaata agttaacgtatccactaata attattattt cttcctagaa caagacagga 71880 ttaagcatca tgaccgtccctattggggga tgtttttata gatgcaagca ctgtggcacc 71940 tactggtata aatgcacctgctgattggaa tgttctttcc ccagatcttc ccctgctggt 72000 ttcttcccag tattcaggtctcagctcaaa tgtgacttcc tcaatgaggc ctcctggtga 72060 tcagatctaa agcaccctctacacaatcac tgtttagtgc tatacccatt aatttactat 72120 catcacactt gtcactatctgcagatgtct tgtttggtta cttttgtngt gtttgtcact 72180 gccagaatat cagttctatgaagaaaaggg ccttgtctat tttgacactt ataganatga 72240 tgnaggnacg acatacaaatggccaatggg catatggaaa aacgcttgac ttcaagagta 72300 ctnatggnta 
tnaccaacatttatggagta actactttga aaagaaccat tctgtcttta 72360 ctatcaagcc aagatactcaaggaaggcag cagaagtgga agctccatgt gggcagagga 72420 gcctagtctt gagatgtgatttagctggta tttgggtgaa acaaataaac cagcctcaaa 72480 ataacacaag gggccgggtgcagtggctca cgcctgtatc ccagcacttt gggaggctcg 72540 aggcaggcag attacttcaggtgaggagtt cgagaccagc ctggctaaca tggtgaacct 72600 ccat 72604
8 17 DNA Artificial Sequence Primer
8 cggggttggt ttccacc 17
9 19 DNA Artificial Sequence Primer
9 gcgaggagag aaatctggg 19
10 22 DNA Artificial Sequence Primer
10 tgctcactac tttgcagtgt tc 22
11 22 DNA Artificial Sequence Primer
11 tgagatcgtg tcactgcatt ct 22
12 27 DNA Artificial Sequence Primer
12 gtaaatctca aaatgttggg ttaatag 27
13 24 DNA Artificial Sequence Primer
13 ctaactcttc ttctatcatt actc 24
14 22 DNA Artificial Sequence Primer
14 tgtttattgt gtgtctgctg tg 22
15 21 DNA Artificial Sequence Primer
15 ggacaaccaa catgcaaaca g 21
16 22 DNA Artificial Sequence Primer
16 cccaggtgtt ttcaattgat gc 22
17 22 DNA Artificial Sequence Primer
17 agcagttttg tccttccaag tg 22
18 25 DNA Artificial Sequence Primer
18 gtgttttgta atctgatcag atctc 25
19 21 DNA Artificial Sequence Primer
19 gcagtatttc tggtccagat c 21
20 23 DNA Artificial Sequence Primer
20 ggtgcacata gatcatgaaa tgg 23
21 23 DNA Homo sapiens
21 taagctgaaa taggtgcctt aag 23
22 23 DNA Artificial Sequence Primer
22 tttattccat ttctgtcccc tac 23
23 22 DNA Artificial Sequence Primer
23 aaggctcagt taggtctgta tc 22
24 23 DNA Artificial Sequence Primer
24 caggagtttt aacgtcttca gac 23
25 23 DNA Artificial Sequence Primer
25 gactcagaaa tgtctaccat ttc 23
26 22 DNA Artificial Sequence Primer
26 tgtctccact tcttcaaagt gc 22
27 24 DNA Artificial Sequence Primer
27 caaaatgtac ctgagaactt aaag 24
28 20 DNA Artificial Sequence Primer
28 cacctccaag tttcatggac 20
29 22 DNA Artificial Sequence Primer
29 caaggtatgc acgtgtcatt tc 22
30 25 DNA Artificial Sequence Primer
30 gaatgtgtat tgggatttag taaac 25
31 25 DNA Artificial Sequence Primer
31 ttgagaatta actattcctg tcaac 25
32 20 DNA Artificial Sequence Primer
32 ccatcctgga cttttactcc 20
33 23 DNA Artificial Sequence Primer
33 ctttcctgca actgtgttta ttg 23
34 147 DNA Homo sapiens
34 ttccctccct ttggaacgca gcgtgggcac ctgcaacgca gagaccactg tatccccggt 60
gcagaatgta atgagtgcct gatacatttg ccgaataaac tattccaagg gttgaacttg 120
ctggaagcaa gagaagcact attctgg 147
35 123 DNA Homo sapiens
35 atggagtctt gctctcgttg cccagactgg agtgcactgc tgcgatctca gctcactgca 60
acctctacct cccaggttca agcgattctc ctgcctcagc ctctcgagtg gctgggacta 120
tag 123
36 398 DNA Homo sapiens
misc_feature (1)...(398) n = A,T,C or G
36 agttgcgtcc ctctctgttg ccaggctgga gttcagtggc atgttcatag ctcactgaag 60
cctcaaattc ntgggttcaa gtgaccctcc tacctcagcc ccatgaggac ctgggactac 120
agttccctcc ctttggaacg cagcgtgggc acctgcaacg cagagaccac tgtatctccg 180
gtgcagaatg taatgagtgc ctgatacatt tgccgaataa actattccaa gggttgaact 240
tgctggaagc aanagaagca ctattctggt aacagcggga acatgaagcc gccactcttg 300
gtgtttattg tgtgtctgct gtggttgaaa gacagtcact gcgcacccac ttggaaggac 360
aaaactgcta tcagtgaaaa cctgaagagt ttttctga 398
37 372 DNA Homo sapiens
37 agttgcgtcc ctctctgttg ccaggctgga gttcagtggc atgttcttag ctcactgaag 60
cctcaaattc ctgggttcaa gtgaccctcc cacctcagcc ccatgaggac ctgggactac 120
agatggagtc ttgctctcgt tgcccagact ggagtgcact gctgcgatct cagctcactg 180
caacctctac ctcccaggtt caagcgattc tcctgcctca gcctctcgag tggctgggac 240
tatagtaaca gcgggaacat gaagccgcca ctcttggtgt ttattgtgtg tccgctgtgg 300
ttgaaagaca gtcactgcgc acccacttgg aaggacaaaa ctgctatcag tgaaaacctg 360
aagagttttt ct 372
38 1815 DNA Cavia sp.
CDS (145)...(1542) 38 cttggagtca actgagtgtggactgaaact tccaaaaact gacatgagga gtcactggag 60 aatcatgatc aaggagctacacactctgac ttaactttat tctgtggaca atgagagaca 120 actgcaagga ttaacagtgagaac atg aag ctg cca ctt ttg atg ttt ccc 171 Met Lys Leu Pro Leu Leu MetPhe Pro 1 5 gtg tgt ctg cta tgg ttg aaa gac tgt cat tgt gca cct act tggaag 219 Val Cys Leu Leu Trp Leu Lys Asp Cys His Cys Ala Pro Thr Trp Lys10 15 20 25 gac aaa act gcc atc agt gaa aac gcg aac agt ttt tct gag gctggg 267 Asp Lys Thr Ala Ile Ser Glu Asn Ala Asn Ser Phe Ser Glu Ala Gly30 35 40 gag ata gac gta gat gga gag gtg aag ata gct ttg att ggc att aaa315 Glu Ile Asp Val Asp Gly Glu Val Lys Ile Ala Leu Ile Gly Ile Lys 4550 55 cag atg aaa atc atg atg gaa agg aga gag gaa gaa cac agc aaa cta363 Gln Met Lys Ile Met Met Glu Arg Arg Glu Glu Glu His Ser Lys Leu 6065 70 atg aaa acc ttg aag aag tgc aaa gaa gaa aag cag gag gcc ctg aaa411 Met Lys Thr Leu Lys Lys Cys Lys Glu Glu Lys Gln Glu Ala Leu Lys 7580 85 ctt atg aat gaa gtt cat gaa cac ctg gag gag gaa gaa agc tta tgc459 Leu Met Asn Glu Val His Glu His Leu Glu Glu Glu Glu Ser Leu Cys 9095 100 105 cag gtt tct ctg gca gat tcc tgg gat gaa tgc agg gct tgc ctggaa 507 Gln Val Ser Leu Ala Asp Ser Trp Asp Glu Cys Arg Ala Cys Leu Glu110 115 120 agt aac tgc atg agg ttt gat acc acc tgc caa cct gca tgg tcctct 555 Ser Asn Cys Met Arg Phe Asp Thr Thr Cys Gln Pro Ala Trp Ser Ser125 130 135 gtg aaa aat atg gtg gaa cag ttt ttc agg aag atc tat cag tttctg 603 Val Lys Asn Met Val Glu Gln Phe Phe Arg Lys Ile Tyr Gln Phe Leu140 145 150 ttt cct ctc cag gaa aat gac aga agt ggc cct gtc agc aaa ggggtc 651 Phe Pro Leu Gln Glu Asn Asp Arg Ser Gly Pro Val Ser Lys Gly Val155 160 165 act gag gaa gat gcg cag gtg tca cac ata gag cat gtg ttc agccag 699 Thr Glu Glu Asp Ala Gln Val Ser His Ile Glu His Val Phe Ser Gln170 175 180 185 ctg agc gca gat gtg aca tct ctc ttc aac aga agc ctt tacgtc ttc 747 Leu Ser Ala Asp Val Thr Ser Leu Phe Asn Arg Ser Leu Tyr ValPhe 190 195 200 aaa cag ctg cgg cga gaa ttt gac cag 
gct ttt cag tca tatttc aca 795 Lys Gln Leu Arg Arg Glu Phe Asp Gln Ala Phe Gln Ser Tyr PheThr 205 210 215 tcg ggg act gac gtt aca gag cct ttc ttt ttt cca tct ttgtcc aag 843 Ser Gly Thr Asp Val Thr Glu Pro Phe Phe Phe Pro Ser Leu SerLys 220 225 230 gag cca gcc tac aga gca gat gct gag cca agc tgg gcc attccc aat 891 Glu Pro Ala Tyr Arg Ala Asp Ala Glu Pro Ser Trp Ala Ile ProAsn 235 240 245 gtc ttc cag ctg ctc tgc aac ttg agt ttc tca gtt tat caaagt gtc 939 Val Phe Gln Leu Leu Cys Asn Leu Ser Phe Ser Val Tyr Gln SerVal 250 255 260 265 agt gaa aaa ctc atc aca acc ctg cgt gcc aca gag gaccct cca aaa 987 Ser Glu Lys Leu Ile Thr Thr Leu Arg Ala Thr Glu Asp ProPro Lys 270 275 280 caa gac aaa gac tcc aac cag gga ggc ccg att tca aagata cta cct 1035 Gln Asp Lys Asp Ser Asn Gln Gly Gly Pro Ile Ser Lys IleLeu Pro 285 290 295 gag caa gac aga ggc tca gat ggg aaa ctt ggc cag aatttg tct gat 1083 Glu Gln Asp Arg Gly Ser Asp Gly Lys Leu Gly Gln Asn LeuSer Asp 300 305 310 tgc gtt aat ttt cgc aag aga tgc cag aaa tgc cag gattat cta tct 1131 Cys Val Asn Phe Arg Lys Arg Cys Gln Lys Cys Gln Asp TyrLeu Ser 315 320 325 gat gac tgc cct aat gtg cct gaa cta tac aga gaa ctcaat gag gcc 1179 Asp Asp Cys Pro Asn Val Pro Glu Leu Tyr Arg Glu Leu AsnGlu Ala 330 335 340 345 ctc cga ctg gtc agt aga tcc aat cag caa tac gaccag gtg gtg cag 1227 Leu Arg Leu Val Ser Arg Ser Asn Gln Gln Tyr Asp GlnVal Val Gln 350 355 360 atg acc cag tat cac ctg gaa gac acc acg ctt ctgatg gag aag atg 1275 Met Thr Gln Tyr His Leu Glu Asp Thr Thr Leu Leu MetGlu Lys Met 365 370 375 aga gag cag ttt ggc tgg gtt tct gaa ctg gca taccag tcc cca gga 1323 Arg Glu Gln Phe Gly Trp Val Ser Glu Leu Ala Tyr GlnSer Pro Gly 380 385 390 gct gag gac atc ttt aat cca gtg aaa gta atg gtagcc cta agt gct 1371 Ala Glu Asp Ile Phe Asn Pro Val Lys Val Met Val AlaLeu Ser Ala 395 400 405 cat gaa gga aat tct tct gat caa gat gac aca gtggtt cct tca agc 1419 His Glu Gly Asn Ser Ser Asp Gln Asp Asp Thr Val ValPro Ser Ser 410 415 420 425 ctc ctg cct tcc tct aac 
ttc aca ctc agc agccct ctt gaa aag agt 1467 Leu Leu Pro Ser Ser Asn Phe Thr Leu Ser Ser ProLeu Glu Lys Ser 430 435 440 gct ggc aac gct aac ttc att gat cac gtg gtagag aag gtt ctt cag 1515 Ala Gly Asn Ala Asn Phe Ile Asp His Val Val GluLys Val Leu Gln 445 450 455 cac ttt aag gag cac ttt aaa act tggtaagaagatt tagtccatcc 1562 His Phe Lys Glu His Phe Lys Thr Trp 460 465tataatcagc aagaattaca ccttcggcca agacctgaga attctgaaaa tacaaagcag 1622gctaacacaa tgaacacagc tgcatgaaag ttaggtatat attaggaagc actattggtt 1682tactttgttg aatggaagtt taatagctat tcaaattgag ttaatataaa aatttcttcc 1742taaaaagtaa aatgtacata tgtagaatat gatgcattag ttctttgtat actaaataaa 1802tactgagtcc cct 1815 39 466 PRT Cavia sp. 39 Met Lys Leu Pro Leu Leu MetPhe Pro Val Cys Leu Leu Trp Leu Lys 1 5 10 15 Asp Cys His Cys Ala ProThr Trp Lys Asp Lys Thr Ala Ile Ser Glu 20 25 30 Asn Ala Asn Ser Phe SerGlu Ala Gly Glu Ile Asp Val Asp Gly Glu 35 40 45 Val Lys Ile Ala Leu IleGly Ile Lys Gln Met Lys Ile Met Met Glu 50 55 60 Arg Arg Glu Glu Glu HisSer Lys Leu Met Lys Thr Leu Lys Lys Cys 65 70 75 80 Lys Glu Glu Lys GlnGlu Ala Leu Lys Leu Met Asn Glu Val His Glu 85 90 95 His Leu Glu Glu GluGlu Ser Leu Cys Gln Val Ser Leu Ala Asp Ser 100 105 110 Trp Asp Glu CysArg Ala Cys Leu Glu Ser Asn Cys Met Arg Phe Asp 115 120 125 Thr Thr CysGln Pro Ala Trp Ser Ser Val Lys Asn Met Val Glu Gln 130 135 140 Phe PheArg Lys Ile Tyr Gln Phe Leu Phe Pro Leu Gln Glu Asn Asp 145 150 155 160Arg Ser Gly Pro Val Ser Lys Gly Val Thr Glu Glu Asp Ala Gln Val 165 170175 Ser His Ile Glu His Val Phe Ser Gln Leu Ser Ala Asp Val Thr Ser 180185 190 Leu Phe Asn Arg Ser Leu Tyr Val Phe Lys Gln Leu Arg Arg Glu Phe195 200 205 Asp Gln Ala Phe Gln Ser Tyr Phe Thr Ser Gly Thr Asp Val ThrGlu 210 215 220 Pro Phe Phe Phe Pro Ser Leu Ser Lys Glu Pro Ala Tyr ArgAla Asp 225 230 235 240 Ala Glu Pro Ser Trp Ala Ile Pro Asn Val Phe GlnLeu Leu Cys Asn 245 250 255 Leu Ser Phe Ser Val Tyr Gln Ser Val Ser GluLys Leu Ile Thr Thr 260 265 270 Leu Arg Ala Thr Glu Asp Pro Pro Lys 
GlnAsp Lys Asp Ser Asn Gln 275 280 285 Gly Gly Pro Ile Ser Lys Ile Leu ProGlu Gln Asp Arg Gly Ser Asp 290 295 300 Gly Lys Leu Gly Gln Asn Leu SerAsp Cys Val Asn Phe Arg Lys Arg 305 310 315 320 Cys Gln Lys Cys Gln AspTyr Leu Ser Asp Asp Cys Pro Asn Val Pro 325 330 335 Glu Leu Tyr Arg GluLeu Asn Glu Ala Leu Arg Leu Val Ser Arg Ser 340 345 350 Asn Gln Gln TyrAsp Gln Val Val Gln Met Thr Gln Tyr His Leu Glu 355 360 365 Asp Thr ThrLeu Leu Met Glu Lys Met Arg Glu Gln Phe Gly Trp Val 370 375 380 Ser GluLeu Ala Tyr Gln Ser Pro Gly Ala Glu Asp Ile Phe Asn Pro 385 390 395 400Val Lys Val Met Val Ala Leu Ser Ala His Glu Gly Asn Ser Ser Asp 405 410415 Gln Asp Asp Thr Val Val Pro Ser Ser Leu Leu Pro Ser Ser Asn Phe 420425 430 Thr Leu Ser Ser Pro Leu Glu Lys Ser Ala Gly Asn Ala Asn Phe Ile435 440 445 Asp His Val Val Glu Lys Val Leu Gln His Phe Lys Glu His PheLys 450 455 460 Thr Trp 465 40 1767 DNA Cavia sp. CDS (145)...(1494) 40cttggagtca actgagtgtg gactgaaact tccaaaaact gacatgagga gtcactggag 60aatcatgatc aaggagctac acactctgac ttaactttat tctgtggaca atgagagaca 120actgcaagga ttaacagtga gaac atg aag ctg cca ctt ttg atg ttt ccc 171 MetLys Leu Pro Leu Leu Met Phe Pro 1 5 gtg tgt ctg cta tgg ttg aaa gac tgtcat tgt gca cct act tgg aag 219 Val Cys Leu Leu Trp Leu Lys Asp Cys HisCys Ala Pro Thr Trp Lys 10 15 20 25 gac aaa act gcc atc agt gaa aac gcgaac agt ttt tct gag gct ggg 267 Asp Lys Thr Ala Ile Ser Glu Asn Ala AsnSer Phe Ser Glu Ala Gly 30 35 40 gag ata gac gta gat gga gag gtg aag atagct ttg att ggc att aaa 315 Glu Ile Asp Val Asp Gly Glu Val Lys Ile AlaLeu Ile Gly Ile Lys 45 50 55 cag atg aaa atc atg atg gaa agg aga gag gaagaa cac agc aaa cta 363 Gln Met Lys Ile Met Met Glu Arg Arg Glu Glu GluHis Ser Lys Leu 60 65 70 atg aaa acc ttg aag aag tgc aaa gaa gaa aag caggag gcc ctg aaa 411 Met Lys Thr Leu Lys Lys Cys Lys Glu Glu Lys Gln GluAla Leu Lys 75 80 85 ctt atg aat gaa gtt cat gaa cac ctg gag gag gaa gaaagc tta tgc 459 Leu Met Asn Glu Val His Glu His Leu Glu Glu Glu Glu SerLeu Cys 90 
95 100 105 cag gtt tct ctg gca gat tcc tgg gat gaa tgc agggct tgc ctg gaa 507 Gln Val Ser Leu Ala Asp Ser Trp Asp Glu Cys Arg AlaCys Leu Glu 110 115 120 agt aac tgc atg agg ttt gat acc acc tgc caa cctgca tgg tcc tct 555 Ser Asn Cys Met Arg Phe Asp Thr Thr Cys Gln Pro AlaTrp Ser Ser 125 130 135 gtg aaa aat atg gaa aat gac aga agt ggc cct gtcagc aaa ggg gtc 603 Val Lys Asn Met Glu Asn Asp Arg Ser Gly Pro Val SerLys Gly Val 140 145 150 act gag gaa gat gcg cag gtg tca cac ata gag catgtg ttc agc cag 651 Thr Glu Glu Asp Ala Gln Val Ser His Ile Glu His ValPhe Ser Gln 155 160 165 ctg agc gca gat gtg aca tct ctc ttc aac aga agcctt tac gtc ttc 699 Leu Ser Ala Asp Val Thr Ser Leu Phe Asn Arg Ser LeuTyr Val Phe 170 175 180 185 aaa cag ctg cgg cga gaa ttt gac cag gct tttcag tca tat ttc aca 747 Lys Gln Leu Arg Arg Glu Phe Asp Gln Ala Phe GlnSer Tyr Phe Thr 190 195 200 tcg ggg act gac gtt aca gag cct ttc ttt tttcca tct ttg tcc aag 795 Ser Gly Thr Asp Val Thr Glu Pro Phe Phe Phe ProSer Leu Ser Lys 205 210 215 gag cca gcc tac aga gca gat gct gag cca agctgg gcc att ccc aat 843 Glu Pro Ala Tyr Arg Ala Asp Ala Glu Pro Ser TrpAla Ile Pro Asn 220 225 230 gtc ttc cag ctg ctc tgc aac ttg agt ttc tcagtt tat caa agt gtc 891 Val Phe Gln Leu Leu Cys Asn Leu Ser Phe Ser ValTyr Gln Ser Val 235 240 245 agt gaa aaa ctc atc aca acc ctg cgt gcc acagag gac cct cca aaa 939 Ser Glu Lys Leu Ile Thr Thr Leu Arg Ala Thr GluAsp Pro Pro Lys 250 255 260 265 caa gac aaa gac tcc aac cag gga ggc ccgatt tca aag ata cta cct 987 Gln Asp Lys Asp Ser Asn Gln Gly Gly Pro IleSer Lys Ile Leu Pro 270 275 280 gag caa gac aga ggc tca gat ggg aaa cttggc cag aat ttg tct gat 1035 Glu Gln Asp Arg Gly Ser Asp Gly Lys Leu GlyGln Asn Leu Ser Asp 285 290 295 tgc gtt aat ttt cgc aag aga tgc cag aaatgc cag gat tat cta tct 1083 Cys Val Asn Phe Arg Lys Arg Cys Gln Lys CysGln Asp Tyr Leu Ser 300 305 310 gat gac tgc cct aat gtg cct gaa cta tacaga gaa ctc aat gag gcc 1131 Asp Asp Cys Pro Asn Val Pro Glu Leu Tyr ArgGlu Leu Asn Glu Ala 
315 320 325 ctc cga ctg gtc agt aga tcc aat cag caatac gac cag gtg gtg cag 1179 Leu Arg Leu Val Ser Arg Ser Asn Gln Gln TyrAsp Gln Val Val Gln 330 335 340 345 atg acc cag tat cac ctg gaa gac accacg ctt ctg atg gag aag atg 1227 Met Thr Gln Tyr His Leu Glu Asp Thr ThrLeu Leu Met Glu Lys Met 350 355 360 aga gag cag ttt ggc tgg gtt tct gaactg gca tac cag tcc cca gga 1275 Arg Glu Gln Phe Gly Trp Val Ser Glu LeuAla Tyr Gln Ser Pro Gly 365 370 375 gct gag gac atc ttt aat cca gtg aaagta atg gta gcc cta agt gct 1323 Ala Glu Asp Ile Phe Asn Pro Val Lys ValMet Val Ala Leu Ser Ala 380 385 390 cat gaa gga aat tct tct gat caa gatgac aca gtg gtt cct tca agc 1371 His Glu Gly Asn Ser Ser Asp Gln Asp AspThr Val Val Pro Ser Ser 395 400 405 ctc ctg cct tcc tct aac ttc aca ctcagc agc cct ctt gaa aag agt 1419 Leu Leu Pro Ser Ser Asn Phe Thr Leu SerSer Pro Leu Glu Lys Ser 410 415 420 425 gct ggc aac gct aac ttc att gatcac gtg gta gag aag gtt ctt cag 1467 Ala Gly Asn Ala Asn Phe Ile Asp HisVal Val Glu Lys Val Leu Gln 430 435 440 cac ttt aag gag cac ttt aaa acttgg taagaagatt tagtccatcc 1514 His Phe Lys Glu His Phe Lys Thr Trp 445450 tataatcagc aagaattaca ccttcggcca agacctgaga attctgaaaa tacaaagcag1574 gctaacacaa tgaacacagc tgcatgaaag ttaggtatat attaggaagc actattggtt1634 tactttgttg aatggaagtt taatagctat tcaaattgag ttaatataaa aatttcttcc1694 taaaaagtaa aatgtacata tgtagaatat gatgcattag ttctttgtat actaaataaa1754 tactgagtcc cct 1767 41 450 PRT Cavia sp. 
41 Met Lys Leu Pro Leu LeuMet Phe Pro Val Cys Leu Leu Trp Leu Lys 1 5 10 15 Asp Cys His Cys AlaPro Thr Trp Lys Asp Lys Thr Ala Ile Ser Glu 20 25 30 Asn Ala Asn Ser PheSer Glu Ala Gly Glu Ile Asp Val Asp Gly Glu 35 40 45 Val Lys Ile Ala LeuIle Gly Ile Lys Gln Met Lys Ile Met Met Glu 50 55 60 Arg Arg Glu Glu GluHis Ser Lys Leu Met Lys Thr Leu Lys Lys Cys 65 70 75 80 Lys Glu Glu LysGln Glu Ala Leu Lys Leu Met Asn Glu Val His Glu 85 90 95 His Leu Glu GluGlu Glu Ser Leu Cys Gln Val Ser Leu Ala Asp Ser 100 105 110 Trp Asp GluCys Arg Ala Cys Leu Glu Ser Asn Cys Met Arg Phe Asp 115 120 125 Thr ThrCys Gln Pro Ala Trp Ser Ser Val Lys Asn Met Glu Asn Asp 130 135 140 ArgSer Gly Pro Val Ser Lys Gly Val Thr Glu Glu Asp Ala Gln Val 145 150 155160 Ser His Ile Glu His Val Phe Ser Gln Leu Ser Ala Asp Val Thr Ser 165170 175 Leu Phe Asn Arg Ser Leu Tyr Val Phe Lys Gln Leu Arg Arg Glu Phe180 185 190 Asp Gln Ala Phe Gln Ser Tyr Phe Thr Ser Gly Thr Asp Val ThrGlu 195 200 205 Pro Phe Phe Phe Pro Ser Leu Ser Lys Glu Pro Ala Tyr ArgAla Asp 210 215 220 Ala Glu Pro Ser Trp Ala Ile Pro Asn Val Phe Gln LeuLeu Cys Asn 225 230 235 240 Leu Ser Phe Ser Val Tyr Gln Ser Val Ser GluLys Leu Ile Thr Thr 245 250 255 Leu Arg Ala Thr Glu Asp Pro Pro Lys GlnAsp Lys Asp Ser Asn Gln 260 265 270 Gly Gly Pro Ile Ser Lys Ile Leu ProGlu Gln Asp Arg Gly Ser Asp 275 280 285 Gly Lys Leu Gly Gln Asn Leu SerAsp Cys Val Asn Phe Arg Lys Arg 290 295 300 Cys Gln Lys Cys Gln Asp TyrLeu Ser Asp Asp Cys Pro Asn Val Pro 305 310 315 320 Glu Leu Tyr Arg GluLeu Asn Glu Ala Leu Arg Leu Val Ser Arg Ser 325 330 335 Asn Gln Gln TyrAsp Gln Val Val Gln Met Thr Gln Tyr His Leu Glu 340 345 350 Asp Thr ThrLeu Leu Met Glu Lys Met Arg Glu Gln Phe Gly Trp Val 355 360 365 Ser GluLeu Ala Tyr Gln Ser Pro Gly Ala Glu Asp Ile Phe Asn Pro 370 375 380 ValLys Val Met Val Ala Leu Ser Ala His Glu Gly Asn Ser Ser Asp 385 390 395400 Gln Asp Asp Thr Val Val Pro Ser Ser Leu Leu Pro Ser Ser Asn Phe 405410 415 Thr Leu Ser Ser Pro Leu Glu Lys Ser Ala Gly 
Asn Ala Asn Phe Ile420 425 430 Asp His Val Val Glu Lys Val Leu Gln His Phe Lys Glu His PheLys 435 440 445 Thr Trp 450 42 1539 DNA Cavia sp. CDS (145)...(1266) 42cttggagtca actgagtgtg gactgaaact tccaaaaact gacatgagga gtcactggag 60aatcatgatc aaggagctac acactctgac ttaactttat tctgtggaca atgagagaca 120actgcaagga ttaacagtga gaac atg aag ctg cca ctt ttg atg ttt ccc 171 MetLys Leu Pro Leu Leu Met Phe Pro 1 5 gtg tgt ctg cta tgg ttg aaa gac tgtcat tgt gca cct act tgg aag 219 Val Cys Leu Leu Trp Leu Lys Asp Cys HisCys Ala Pro Thr Trp Lys 10 15 20 25 gac aaa act gcc atc agt gaa aac gcgaac agt ttt tct gag gct ggg 267 Asp Lys Thr Ala Ile Ser Glu Asn Ala AsnSer Phe Ser Glu Ala Gly 30 35 40 gag ata gac gta gat gga gag gtg aag atagct ttg att ggc att aaa 315 Glu Ile Asp Val Asp Gly Glu Val Lys Ile AlaLeu Ile Gly Ile Lys 45 50 55 cag atg aaa atc atg atg gaa agg aga gag gaagaa cac agc aaa cta 363 Gln Met Lys Ile Met Met Glu Arg Arg Glu Glu GluHis Ser Lys Leu 60 65 70 atg aaa acc ttg aag aag tgc aaa gaa gaa aag caggag gcc ctg aaa 411 Met Lys Thr Leu Lys Lys Cys Lys Glu Glu Lys Gln GluAla Leu Lys 75 80 85 ctt atg aat gaa gtt cat gaa cac ctg gag gag gaa gaaagc tta tgc 459 Leu Met Asn Glu Val His Glu His Leu Glu Glu Glu Glu SerLeu Cys 90 95 100 105 cag gtt tct ctg gca gat tcc tgg gat gaa tgc agggct tgc ctg gaa 507 Gln Val Ser Leu Ala Asp Ser Trp Asp Glu Cys Arg AlaCys Leu Glu 110 115 120 agt aac tgc atg agg ttt gat acc acc tgc caa cctgca tgg tcc tct 555 Ser Asn Cys Met Arg Phe Asp Thr Thr Cys Gln Pro AlaTrp Ser Ser 125 130 135 gtg aaa aat atg gag cca gcc tac aga gca gat gctgag cca agc tgg 603 Val Lys Asn Met Glu Pro Ala Tyr Arg Ala Asp Ala GluPro Ser Trp 140 145 150 gcc att ccc aat gtc ttc cag ctg ctc tgc aac ttgagt ttc tca gtt 651 Ala Ile Pro Asn Val Phe Gln Leu Leu Cys Asn Leu SerPhe Ser Val 155 160 165 tat caa agt gtc agt gaa aaa ctc atc aca acc ctgcgt gcc aca gag 699 Tyr Gln Ser Val Ser Glu Lys Leu Ile Thr Thr Leu ArgAla Thr Glu 170 175 180 185 gac cct cca aaa caa gac aaa gac tcc aac 
caggga ggc ccg att tca 747 Asp Pro Pro Lys Gln Asp Lys Asp Ser Asn Gln GlyGly Pro Ile Ser 190 195 200 aag ata cta cct gag caa gac aga ggc tca gatggg aaa ctt ggc cag 795 Lys Ile Leu Pro Glu Gln Asp Arg Gly Ser Asp GlyLys Leu Gly Gln 205 210 215 aat ttg tct gat tgc gtt aat ttt cgc aag agatgc cag aaa tgc cag 843 Asn Leu Ser Asp Cys Val Asn Phe Arg Lys Arg CysGln Lys Cys Gln 220 225 230 gat tat cta tct gat gac tgc cct aat gtg cctgaa cta tac aga gaa 891 Asp Tyr Leu Ser Asp Asp Cys Pro Asn Val Pro GluLeu Tyr Arg Glu 235 240 245 ctc aat gag gcc ctc cga ctg gtc agt aga tccaat cag caa tac gac 939 Leu Asn Glu Ala Leu Arg Leu Val Ser Arg Ser AsnGln Gln Tyr Asp 250 255 260 265 cag gtg gtg cag atg acc cag tat cac ctggaa gac acc acg ctt ctg 987 Gln Val Val Gln Met Thr Gln Tyr His Leu GluAsp Thr Thr Leu Leu 270 275 280 atg gag aag atg aga gag cag ttt ggc tgggtt tct gaa ctg gca tac 1035 Met Glu Lys Met Arg Glu Gln Phe Gly Trp ValSer Glu Leu Ala Tyr 285 290 295 cag tcc cca gga gct gag gac atc ttt aatcca gtg aaa gta atg gta 1083 Gln Ser Pro Gly Ala Glu Asp Ile Phe Asn ProVal Lys Val Met Val 300 305 310 gcc cta agt gct cat gaa gga aat tct tctgat caa gat gac aca gtg 1131 Ala Leu Ser Ala His Glu Gly Asn Ser Ser AspGln Asp Asp Thr Val 315 320 325 gtt cct tca agc ctc ctg cct tcc tct aacttc aca ctc agc agc cct 1179 Val Pro Ser Ser Leu Leu Pro Ser Ser Asn PheThr Leu Ser Ser Pro 330 335 340 345 ctt gaa aag agt gct ggc aac gct aacttc att gat cac gtg gta gag 1227 Leu Glu Lys Ser Ala Gly Asn Ala Asn PheIle Asp His Val Val Glu 350 355 360 aag gtt ctt cag cac ttt aag gag cacttt aaa act tgg taagaagatt 1276 Lys Val Leu Gln His Phe Lys Glu His PheLys Thr Trp 365 370 tagtccatcc tataatcagc aagaattaca ccttcggccaagacctgaga attctgaaaa 1336 tacaaagcag gctaacacaa tgaacacagc tgcatgaaagttaggtatat attaggaagc 1396 actattggtt tactttgttg aatggaagtt taatagctattcaaattgag ttaatataaa 1456 aatttcttcc taaaaagtaa aatgtacata tgtagaatatgatgcattag ttctttgtat 1516 actaaataaa tactgagtcc cct 1539 43 374 PRTCavia sp. 
43 Met Lys Leu Pro Leu Leu Met Phe Pro Val Cys Leu Leu Trp LeuLys 1 5 10 15 Asp Cys His Cys Ala Pro Thr Trp Lys Asp Lys Thr Ala IleSer Glu 20 25 30 Asn Ala Asn Ser Phe Ser Glu Ala Gly Glu Ile Asp Val AspGly Glu 35 40 45 Val Lys Ile Ala Leu Ile Gly Ile Lys Gln Met Lys Ile MetMet Glu 50 55 60 Arg Arg Glu Glu Glu His Ser Lys Leu Met Lys Thr Leu LysLys Cys 65 70 75 80 Lys Glu Glu Lys Gln Glu Ala Leu Lys Leu Met Asn GluVal His Glu 85 90 95 His Leu Glu Glu Glu Glu Ser Leu Cys Gln Val Ser LeuAla Asp Ser 100 105 110 Trp Asp Glu Cys Arg Ala Cys Leu Glu Ser Asn CysMet Arg Phe Asp 115 120 125 Thr Thr Cys Gln Pro Ala Trp Ser Ser Val LysAsn Met Glu Pro Ala 130 135 140 Tyr Arg Ala Asp Ala Glu Pro Ser Trp AlaIle Pro Asn Val Phe Gln 145 150 155 160 Leu Leu Cys Asn Leu Ser Phe SerVal Tyr Gln Ser Val Ser Glu Lys 165 170 175 Leu Ile Thr Thr Leu Arg AlaThr Glu Asp Pro Pro Lys Gln Asp Lys 180 185 190 Asp Ser Asn Gln Gly GlyPro Ile Ser Lys Ile Leu Pro Glu Gln Asp 195 200 205 Arg Gly Ser Asp GlyLys Leu Gly Gln Asn Leu Ser Asp Cys Val Asn 210 215 220 Phe Arg Lys ArgCys Gln Lys Cys Gln Asp Tyr Leu Ser Asp Asp Cys 225 230 235 240 Pro AsnVal Pro Glu Leu Tyr Arg Glu Leu Asn Glu Ala Leu Arg Leu 245 250 255 ValSer Arg Ser Asn Gln Gln Tyr Asp Gln Val Val Gln Met Thr Gln 260 265 270Tyr His Leu Glu Asp Thr Thr Leu Leu Met Glu Lys Met Arg Glu Gln 275 280285 Phe Gly Trp Val Ser Glu Leu Ala Tyr Gln Ser Pro Gly Ala Glu Asp 290295 300 Ile Phe Asn Pro Val Lys Val Met Val Ala Leu Ser Ala His Glu Gly305 310 315 320 Asn Ser Ser Asp Gln Asp Asp Thr Val Val Pro Ser Ser LeuLeu Pro 325 330 335 Ser Ser Asn Phe Thr Leu Ser Ser Pro Leu Glu Lys SerAla Gly Asn 340 345 350 Ala Asn Phe Ile Asp His Val Val Glu Lys Val LeuGln His Phe Lys 355 360 365 Glu His Phe Lys Thr Trp 370 44 1536 DNACavia sp. 
CDS (145)...(1263) 44 cttggagtca actgagtgtg gactgaaacttccaaaaact gacatgagga gtcactggag 60 aatcatgatc aaggagctac acactctgacttaactttat tctgtggaca atgagagaca 120 actgcaagga ttaacagtga gaac atg aagctg cca ctt ttg atg ttt ccc 171 Met Lys Leu Pro Leu Leu Met Phe Pro 1 5gtg tgt ctg cta tgg ttg aaa gac tgt cat tgt gca cct act tgg aag 219 ValCys Leu Leu Trp Leu Lys Asp Cys His Cys Ala Pro Thr Trp Lys 10 15 20 25gac aaa act gcc atc agt gaa aac gcg aac agt ttt tct gag gct ggg 267 AspLys Thr Ala Ile Ser Glu Asn Ala Asn Ser Phe Ser Glu Ala Gly 30 35 40 gagata gac gta gat gga gag gtg aag ata gct ttg att ggc att aaa 315 Glu IleAsp Val Asp Gly Glu Val Lys Ile Ala Leu Ile Gly Ile Lys 45 50 55 cag atgaaa atc atg atg gaa agg aga gag gaa gaa cac agc aaa cta 363 Gln Met LysIle Met Met Glu Arg Arg Glu Glu Glu His Ser Lys Leu 60 65 70 atg aaa accttg aag aag tgc aaa gaa gaa aag cag gag gcc ctg aaa 411 Met Lys Thr LeuLys Lys Cys Lys Glu Glu Lys Gln Glu Ala Leu Lys 75 80 85 ctt atg aat gaagtt cat gaa cac ctg gag gag gaa gaa agc tta tgc 459 Leu Met Asn Glu ValHis Glu His Leu Glu Glu Glu Glu Ser Leu Cys 90 95 100 105 cag gtt tctctg gca gat tcc tgg gat gaa tgc agg gct tgc ctg gaa 507 Gln Val Ser LeuAla Asp Ser Trp Asp Glu Cys Arg Ala Cys Leu Glu 110 115 120 agt aac tgcatg agg ttt gat acc acc tgc caa cct gca tgg tcc tct 555 Ser Asn Cys MetArg Phe Asp Thr Thr Cys Gln Pro Ala Trp Ser Ser 125 130 135 gtg aaa aatatg cca gcc tac aga gca gat gct gag cca agc tgg gcc 603 Val Lys Asn MetPro Ala Tyr Arg Ala Asp Ala Glu Pro Ser Trp Ala 140 145 150 att ccc aatgtc ttc cag ctg ctc tgc aac ttg agt ttc tca gtt tat 651 Ile Pro Asn ValPhe Gln Leu Leu Cys Asn Leu Ser Phe Ser Val Tyr 155 160 165 caa agt gtcagt gaa aaa ctc atc aca acc ctg cgt gcc aca gag gac 699 Gln Ser Val SerGlu Lys Leu Ile Thr Thr Leu Arg Ala Thr Glu Asp 170 175 180 185 cct ccaaaa caa gac aaa gac tcc aac cag gga ggc ccg att tca aag 747 Pro Pro LysGln Asp Lys Asp Ser Asn Gln Gly Gly Pro Ile Ser Lys 190 195 200 ata ctacct gag caa gac aga ggc tca 
gat ggg aaa ctt ggc cag aat 795 Ile Leu ProGlu Gln Asp Arg Gly Ser Asp Gly Lys Leu Gly Gln Asn 205 210 215 ttg tctgat tgc gtt aat ttt cgc aag aga tgc cag aaa tgc cag gat 843 Leu Ser AspCys Val Asn Phe Arg Lys Arg Cys Gln Lys Cys Gln Asp 220 225 230 tat ctatct gat gac tgc cct aat gtg cct gaa cta tac aga gaa ctc 891 Tyr Leu SerAsp Asp Cys Pro Asn Val Pro Glu Leu Tyr Arg Glu Leu 235 240 245 aat gaggcc ctc cga ctg gtc agt aga tcc aat cag caa tac gac cag 939 Asn Glu AlaLeu Arg Leu Val Ser Arg Ser Asn Gln Gln Tyr Asp Gln 250 255 260 265 gtggtg cag atg acc cag tat cac ctg gaa gac acc acg ctt ctg atg 987 Val ValGln Met Thr Gln Tyr His Leu Glu Asp Thr Thr Leu Leu Met 270 275 280 gagaag atg aga gag cag ttt ggc tgg gtt tct gaa ctg gca tac cag 1035 Glu LysMet Arg Glu Gln Phe Gly Trp Val Ser Glu Leu Ala Tyr Gln 285 290 295 tcccca gga gct gag gac atc ttt aat cca gtg aaa gta atg gta gcc 1083 Ser ProGly Ala Glu Asp Ile Phe Asn Pro Val Lys Val Met Val Ala 300 305 310 ctaagt gct cat gaa gga aat tct tct gat caa gat gac aca gtg gtt 1131 Leu SerAla His Glu Gly Asn Ser Ser Asp Gln Asp Asp Thr Val Val 315 320 325 ccttca agc ctc ctg cct tcc tct aac ttc aca ctc agc agc cct ctt 1179 Pro SerSer Leu Leu Pro Ser Ser Asn Phe Thr Leu Ser Ser Pro Leu 330 335 340 345gaa aag agt gct ggc aac gct aac ttc att gat cac gtg gta gag aag 1227 GluLys Ser Ala Gly Asn Ala Asn Phe Ile Asp His Val Val Glu Lys 350 355 360gtt ctt cag cac ttt aag gag cac ttt aaa act tgg taagaagatt 1273 Val LeuGln His Phe Lys Glu His Phe Lys Thr Trp 365 370 tagtccatcc tataatcagcaagaattaca ccttcggcca agacctgaga attctgaaaa 1333 tacaaagcag gctaacacaatgaacacagc tgcatgaaag ttaggtatat attaggaagc 1393 actattggtt tactttgttgaatggaagtt taatagctat tcaaattgag ttaatataaa 1453 aatttcttcc taaaaagtaaaatgtacata tgtagaatat gatgcattag ttctttgtat 1513 actaaataaa tactgagtcccct 1536 45 373 PRT Cavia sp. 
45 Met Lys Leu Pro Leu Leu Met Phe Pro ValCys Leu Leu Trp Leu Lys 1 5 10 15 Asp Cys His Cys Ala Pro Thr Trp LysAsp Lys Thr Ala Ile Ser Glu 20 25 30 Asn Ala Asn Ser Phe Ser Glu Ala GlyGlu Ile Asp Val Asp Gly Glu 35 40 45 Val Lys Ile Ala Leu Ile Gly Ile LysGln Met Lys Ile Met Met Glu 50 55 60 Arg Arg Glu Glu Glu His Ser Lys LeuMet Lys Thr Leu Lys Lys Cys 65 70 75 80 Lys Glu Glu Lys Gln Glu Ala LeuLys Leu Met Asn Glu Val His Glu 85 90 95 His Leu Glu Glu Glu Glu Ser LeuCys Gln Val Ser Leu Ala Asp Ser 100 105 110 Trp Asp Glu Cys Arg Ala CysLeu Glu Ser Asn Cys Met Arg Phe Asp 115 120 125 Thr Thr Cys Gln Pro AlaTrp Ser Ser Val Lys Asn Met Pro Ala Tyr 130 135 140 Arg Ala Asp Ala GluPro Ser Trp Ala Ile Pro Asn Val Phe Gln Leu 145 150 155 160 Leu Cys AsnLeu Ser Phe Ser Val Tyr Gln Ser Val Ser Glu Lys Leu 165 170 175 Ile ThrThr Leu Arg Ala Thr Glu Asp Pro Pro Lys Gln Asp Lys Asp 180 185 190 SerAsn Gln Gly Gly Pro Ile Ser Lys Ile Leu Pro Glu Gln Asp Arg 195 200 205Gly Ser Asp Gly Lys Leu Gly Gln Asn Leu Ser Asp Cys Val Asn Phe 210 215220 Arg Lys Arg Cys Gln Lys Cys Gln Asp Tyr Leu Ser Asp Asp Cys Pro 225230 235 240 Asn Val Pro Glu Leu Tyr Arg Glu Leu Asn Glu Ala Leu Arg LeuVal 245 250 255 Ser Arg Ser Asn Gln Gln Tyr Asp Gln Val Val Gln Met ThrGln Tyr 260 265 270 His Leu Glu Asp Thr Thr Leu Leu Met Glu Lys Met ArgGlu Gln Phe 275 280 285 Gly Trp Val Ser Glu Leu Ala Tyr Gln Ser Pro GlyAla Glu Asp Ile 290 295 300 Phe Asn Pro Val Lys Val Met Val Ala Leu SerAla His Glu Gly Asn 305 310 315 320 Ser Ser Asp Gln Asp Asp Thr Val ValPro Ser Ser Leu Leu Pro Ser 325 330 335 Ser Asn Phe Thr Leu Ser Ser ProLeu Glu Lys Ser Ala Gly Asn Ala 340 345 350 Asn Phe Ile Asp His Val ValGlu Lys Val Leu Gln His Phe Lys Glu 355 360 365 His Phe Lys Thr Trp 37046 2464 DNA Bos sp. 
46 gcaacctcgt tggtgagagc ctgcagttag tgtcacggcggaaacatgaa gccgccactc 60 ttggtgttta ttgtgtatct gctgcggctg agagactgtcagtgtgcgcc tacagggaag 120 gaccgaactt ccatccgtga agacccgaag ggtttttccaaggctgggga gatagacgta 180 gatgaagagg tgaagaaggc tttgattggc atgaagcagatgaaaatcct gatggaaaga 240 agagaggagg aacatagcaa actaatgaga acactgaagaaatgcagaga agaaaagcag 300 gaggccctga agcttatgaa tgaagttcaa gaacatctagaagaggaaga aaggctatgc 360 caggtgtctc tgatgggttc ctgggacgaa tgcaaatcttgcctggaaag tgactgcatg 420 agattttata caacctgcca aagcagttgg tcctctatgaaatccacgat tgaacgggtt 480 ttccggaaga tatatcagtt tctctttcct ttccatgaagacgatgaaaa agagcttcct 540 gttggtgaga agttcactga ggaagatgta cagctgatgcagatagagaa tgtgttcagc 600 cagctgaccg tggatgtggg atttctctat aacatgagctttcacgtctt caaacagatg 660 cagcaagaat ttgacctggc ttttcaatca tactttatgtcagacacaga ctccatggag 720 ccttactttt ttccagcttt ttccaaagag ccagcaaaaaaagcacatcc tatgcagagt 780 tgggacattc ccagcttctt ccagctgttt tgtaatttcagcctctctgt ttatcaaagt 840 gtcagcgcaa cagttacaga gatgctgaag gccattgaggacttatccaa acaagacaaa 900 gattctgccc acggtggacc gagttccacg acgtggcctgtgcggggcag agggctgtgt 960 ggagaacctg gccagaactc gtccgaatgt ctccaatttcatgcaagatg ccagaaatgt 1020 caggattacc tatgggcaga ctgccctgct gttcctgaactatacacaaa ggcggatgag 1080 gcccttgagt tggtcaacat atccaatcag cagtatgcccaggtactcca gatgacccag 1140 catcacttgg aggacaccac gtatctgatg gagaagatgagagagcagtt tggttgggta 1200 acagagctgg ccagccagac cccaggaagc gagaacatcttcagtttcat aaaggtagtt 1260 ccaggtgttc acgaaggaaa tttctccaaa caagatgaaaagatgataga cataagcatt 1320 ctgccttcct ctaatttcac actcaccatc cctcttgaagaaagtgctga gagttccgac 1380 ttcattagct acatgctggc caaagctgta cagcattttaaggaacattt taaatcttgg 1440 taagcagagt atttgattag ggacgtttgc tgataggaatagatggttct taaaagggaa 1500 aaatgacaaa actagctttt gaataccttg aaaacgtattcaacctcatt aataatcaaa 1560 ggcatgaaaa ctaagacaag ttagcagttt ttacctattgaattttcaaa ttaaaaaaaa 1620 aaatcctgat agaatgcaat gaaatgagaa ttcttatatgtgattgccag aaacaaactg 1680 gttttgtctt tttgaaaagt tattcaatta tacatatcaagagtcatcaa 
atttcttttt 1740 aatataataa ttccacttct ggaatcaatc caaaggagtaaatctaaaat tgaattgaag 1800 ttcccacccc aagatcaata tttgcaaatt atttaaaatagtaaactgtt aaaaactgaa 1860 tgtcatctga atgtctaaaa accagaaatg gttaaaagctgtggctaaat atgctccaaa 1920 tatcttataa aaccattaaa aatatttata aaatttaaatcatgacatga catctgctgg 1980 aacaagagtt tattctaagc ctatctataa ggcaaatattattattacta tcttccagaa 2040 aagaaacttg agactcaggg tccaagtgtt agttgctcagtcatgtctga ctctttggga 2100 ccccttggac tgtagcccac caggctcctc tgtccgtgggattcttcaga caggaatact 2160 ggggcaggtt gctatttcct tctccaggaa atcttccctatccagggatg gaacccaggt 2220 ctcctgcatt gcaggtagat gctttactat ctgagcaaccaaatgaatta ctcaagtcag 2280 tagggggtag aggcaaattt taacttagtt ttctctgaatcataattgcc acattaaact 2340 ggttcctgtt gggacatttg gttgaaaaaa ataaagtgaaaaatgagtat aaaactctat 2400 aaatgtaatg atcaaaacga aaaaaaatct acaatctgcattaaaaataa aaagggttgg 2460 cagg 2464 47 3016 DNA Bos sp. 47 cagaagctggtggcaacctc gttggtgaga gcctgcagtt agtgtcacgg cggaaacatg 60 aagccgccactcttggtgtt tattgtgtat ctgctgcggc tgagagactg tcagtgtgcg 120 cctacagggaaggaccgaac ttccatccgt gaagacccga agggtttttc caaggctggg 180 gagatagacgtagatgaaga ggtgaagaag gctttgattg gcatgaagca gatgaaaatc 240 ctgatggaaagaagagagga ggaacatagc aaactaatga gaacactgaa gaaatgcaga 300 gaagaaaagcaggaggccct gaagcttatg aatgaagttc aagaacatct agaagaggaa 360 gaaaggctatgccaggtgtc tctgatgggt tcctgggacg aatgcaaatc ttgcctggaa 420 agtgactgcatgagatttta tacaacctgc caaagcagtt ggtcctctat gaaatccacg 480 attgaacgggttttccggaa gatatatcag tttctctttc ctttccatga agacgatgaa 540 aaagagcttcctgttggtga gaagttcact gaggaagatg tacagctgat gcagatagag 600 aatgtgttcagccagctgac cgtggatgtg ggatttctct ataacatgag ctttcacgtc 660 ttcaaacagatgcagcaaga atttgacctg gcttttcaat catactttat gtcagacaca 720 gactccatggagccttactt ttttccagct ttttccaaag agccagcaaa aaaagcacat 780 cctatgcagagttgggacat tcccagcttc ttccagctgt tttgtaattt cagcctctct 840 gtttatcaaagtgtcagcgc aacagttaca gagatgctga aggccattga ggacttatcc 900 aaacaagacaaagattctgc ccacggtgga ccgagttcca cgacgtggcc tgtgcggggc 960 
agagggctgtgtggagaacc tggccagaac tcgtccgaat gtctccaatt tcatgcaaga 1020 tgccagaaatgtcaggatta cctatgggca gactgccctg ctgttcctga actatacaca 1080 aaggcggatgaggcccttga gttggtcaac atatccaatc agcagtatgc ccaggtactc 1140 cagatgacccagcatcactt ggaggacacc acgtatctga tggagaagat gagagagcag 1200 tttggttgggtaacagagct ggccagccag accccaggaa gcgagaacat cttcagtttc 1260 ataaaggtagttccaggtgt tcacgaagga aatttctcca aacaagatga aaagatgata 1320 gacataagcattctgccttc ctctaatttc acactcacca tccctcttga agaaagtgct 1380 gagagttccgacttcattag ctacatgctg gccaaagctg tacagcattt taaggaacat 1440 tttaaatcttggtaagcaga gtatttgatt agggacgttt gctgatagga atagatggtt 1500 cttaaaagggaaaaatgaca aaactagctt ttgaatacct tgaaaacgta ttcaacctca 1560 ttaataatcaaaggcatgaa aactaagaca agttagcagt ttttacctat tgaattttca 1620 aattaaaaaaaaaaatcctg atagaatgca atgaaatgag aattcttata tgtgattgcc 1680 agaaacaaactggttttgtc tttttgaaaa gttattcaat tatacatatc aagagtcatc 1740 aaatttctttttaatataat aattccactt ctggaatcaa tccaaaggag taaatctaaa 1800 attgaattgaagttcccacc ccaagatcaa tatttgcaaa ttatttaaaa tagtaaactg 1860 ttaaaaactgaatgtcatct gaatgtctaa aaaccagaaa tggttaaaag ctgtggctaa 1920 atatgctccaaatatcttat aaaaccatta aaaatattta taaaatttaa atcatgacat 1980 gacatctgctggaacaagag tttattctaa gcctatctat aaggcaaata ttattattac 2040 tatcttccagaaaagaaact tgagactcag ggtccaagtg ttagttgctc agtcatgtct 2100 gactctttgagaccccttgg actgtggccc accaggctcc tctgtccatg ggattcttca 2160 gacaagaatactggagcagg ttgctatttc cttctccagg aaatcttccc tatccaggga 2220 tggaacccaggtctcctgca ttgcaggtag atgctttact atctgagcaa ccaaatgaat 2280 tactcaagtcagtagggggt agaggcaaat tttaacttag ttttctctga atcataattg 2340 ccacattaaactggttcctg ttgggacatt tggttgaaaa aaataaagtg aaaaatgagt 2400 ataaaactctataaatgtaa tgatcaaaac gaaaaaaaat ctacaatctg cattaaaaat 2460 aaaaagggttggcaggaatt acggttggaa atggatgatt ttttttaacc ttttcatctt 2520 ttgatattttacaattttct ataatgaata aataattttg agatttcaaa ttagaagata 2580 tgttgctaaaatagctaggt aaatgtagat tgaacactgt atcaatgtgt tctcatcttt 2640 aaactttagtataagtactt ctattccatg 
gtaatcctac agtaagacga aatgtaaatc 2700 tgttcggtctacaggaaaaa caactaaatg acatttcaga cgtacattac catctctgtt 2760 aggataatcttctgaattaa tggcacaatt agaactgtac atagtattct cctttggtaa 2820 aatggtcaatcttaaagaag cattaaatgt taattctaag ttattactca taagggacct 2880 tgtaggtaggtccctatcaa tgtataatta agctgggtat ttctagattc gctgcctctc 2940 cctttatctctgaatgttgg agaggttgtt ggtcatcaat caaccaatat ctttttagca 3000 tcttctaagtgaaggc 3016 48 2488 DNA Bos sp. CDS (71)...(1465) 48 gtgaaggtccttacagaagc tggtggcaac ctcgttggtg agagcctgca gttagtgtca 60 cggcggaaac atgaag ccg cca atc ttg gtg ttt atc gtg tat ctg ctg 109 Met Lys Pro Pro IleLeu Val Phe Ile Val Tyr Leu Leu 1 5 10 cag ctg aga gac tgt cag tgt gcgcct aca ggg aag gac cga act tcc 157 Gln Leu Arg Asp Cys Gln Cys Ala ProThr Gly Lys Asp Arg Thr Ser 15 20 25 atc cgt gaa gac ccg aag ggt ttt tccaag gct ggg gag ata gac gta 205 Ile Arg Glu Asp Pro Lys Gly Phe Ser LysAla Gly Glu Ile Asp Val 30 35 40 45 gat gaa gag gtg aag aag gct ttg attggc atg aag cag atg aaa atc 253 Asp Glu Glu Val Lys Lys Ala Leu Ile GlyMet Lys Gln Met Lys Ile 50 55 60 ctg atg gaa aga aga gag gag gaa cat agcaaa cta atg aga acc ctg 301 Leu Met Glu Arg Arg Glu Glu Glu His Ser LysLeu Met Arg Thr Leu 65 70 75 aag aaa tgc aga gaa gaa aag cag gag gcc ctgaag ctt atg aat gaa 349 Lys Lys Cys Arg Glu Glu Lys Gln Glu Ala Leu LysLeu Met Asn Glu 80 85 90 gtt caa gaa cat cta gaa gag gaa gaa agg cta tgccag gtg tct ctg 397 Val Gln Glu His Leu Glu Glu Glu Glu Arg Leu Cys GlnVal Ser Leu 95 100 105 atg ggt tcc tgg gac gaa tgc aaa tct tgc ctg gaaagt gac tgc atg 445 Met Gly Ser Trp Asp Glu Cys Lys Ser Cys Leu Glu SerAsp Cys Met 110 115 120 125 aga ttt tat aca acc tgc caa agc agt tgg tcctct atg aaa tcc acg 493 Arg Phe Tyr Thr Thr Cys Gln Ser Ser Trp Ser SerMet Lys Ser Thr 130 135 140 att gaa cgg gtt ttc cgg aag ata tat cag tttctc ttt cct ttc cat 541 Ile Glu Arg Val Phe Arg Lys Ile Tyr Gln Phe LeuPhe Pro Phe His 145 150 155 gaa gac gat gaa aaa gag ctt cct gtt ggt gagaag ttc act gag gaa 589 Glu Asp Asp Glu Lys 
Glu Leu Pro Val Gly Glu LysPhe Thr Glu Glu 160 165 170 gat gta cag ctg atg cag ata gag aat gtg ttcagc cag ctg acc gtg 637 Asp Val Gln Leu Met Gln Ile Glu Asn Val Phe SerGln Leu Thr Val 175 180 185 gac gtg gga ttt ctc tat aac atg agc ttt cacgtc ttc aaa cag atg 685 Asp Val Gly Phe Leu Tyr Asn Met Ser Phe His ValPhe Lys Gln Met 190 195 200 205 cag caa gaa ttt gac ctg gct ttt caa tcatac ttt atg tca gac aca 733 Gln Gln Glu Phe Asp Leu Ala Phe Gln Ser TyrPhe Met Ser Asp Thr 210 215 220 gac tcc atg gag cct tac ttt ttt cca gctttt tcc aaa gag cca gca 781 Asp Ser Met Glu Pro Tyr Phe Phe Pro Ala PheSer Lys Glu Pro Ala 225 230 235 aaa aaa gca cat cct atg cag agt tgg gacatt ccc agc ttc ttc cag 829 Lys Lys Ala His Pro Met Gln Ser Trp Asp IlePro Ser Phe Phe Gln 240 245 250 ctg ttt tgt aat ttc agc ctc tct gtt tatcaa agt gtc agc gca aca 877 Leu Phe Cys Asn Phe Ser Leu Ser Val Tyr GlnSer Val Ser Ala Thr 255 260 265 gtt aca gag atg ctg aag gcc att gag gactta tcc aaa caa gac aaa 925 Val Thr Glu Met Leu Lys Ala Ile Glu Asp LeuSer Lys Gln Asp Lys 270 275 280 285 gat tct gcc cac ggt gga ccg agt tccacg acg tgg cct gtg cgg ggc 973 Asp Ser Ala His Gly Gly Pro Ser Ser ThrThr Trp Pro Val Arg Gly 290 295 300 aga ggg ctg tgt gga gaa cct ggc cagaac tcg tcc gaa tgt ctc caa 1021 Arg Gly Leu Cys Gly Glu Pro Gly Gln AsnSer Ser Glu Cys Leu Gln 305 310 315 ttt cat gca aga tgc cag aaa tgt caggat tac cta tgg gca gac tgc 1069 Phe His Ala Arg Cys Gln Lys Cys Gln AspTyr Leu Trp Ala Asp Cys 320 325 330 cct gct gtt cct gaa cta tac aca aaggcg gat gag gcc ctt gag ttg 1117 Pro Ala Val Pro Glu Leu Tyr Thr Lys AlaAsp Glu Ala Leu Glu Leu 335 340 345 gtc aac ata tcc aat cag cag tat gcccag gta ctc cag atg acc cag 1165 Val Asn Ile Ser Asn Gln Gln Tyr Ala GlnVal Leu Gln Met Thr Gln 350 355 360 365 cat cac ttg gag gac acc acg tatctg atg gag aag atg aga gag cag 1213 His His Leu Glu Asp Thr Thr Tyr LeuMet Glu Lys Met Arg Glu Gln 370 375 380 ttt ggt tgg gta aca gag ctg gccagc cag acc cca gga agc gag aac 1261 Phe Gly Trp 
Val Thr Glu Leu Ala SerGln Thr Pro Gly Ser Glu Asn 385 390 395 atc ttc agt ttc ata aag gta gttcca ggt gtt cac gaa gga aat ttc 1309 Ile Phe Ser Phe Ile Lys Val Val ProGly Val His Glu Gly Asn Phe 400 405 410 tcc aaa caa gat gaa aag atg atagac ata agc att ctg cct tcc tct 1357 Ser Lys Gln Asp Glu Lys Met Ile AspIle Ser Ile Leu Pro Ser Ser 415 420 425 aat ttc aca ctc acc atc cct cttgaa gaa agt gct gag agt tcc gac 1405 Asn Phe Thr Leu Thr Ile Pro Leu GluGlu Ser Ala Glu Ser Ser Asp 430 435 440 445 ttc att agc tac atg ctg gccaaa gct gta cag cat ttt aag gaa cat 1453 Phe Ile Ser Tyr Met Leu Ala LysAla Val Gln His Phe Lys Glu His 450 455 460 ttt aaa tct tgg taagcagagtatttgattag ggacgtttgc tgataggaat 1505 Phe Lys Ser Trp 465 agatggttcttaaaagggaa aaatgacaaa actagctttt gaataccttg aaaacgtatt 1565 caacctcattaataatcaaa ggcatgaaaa ctaagacaag ttagcagttt ttacctattg 1625 aattttcaaattaaaaaaaa aatcctgata gaatgcaatg aaatgagaat tcttatatgt 1685 gattgccagaaacaaactgg ttttgtcttt ttgaaaagtt attcaattat acatatcaag 1745 agtcatcaaatttcttttta atataataat tccacttctg gaatcaatcc aaaggagtaa 1805 atctaaaattgaattgaagt tcccacccca agatcaatat ttgcaaatta tttaaaatag 1865 taaactgttaaaaactgaat gtcatctgaa tgtctaaaaa ccagaaatgg ttaaaagctg 1925 tggctaaatatgctccaaat atcttataaa accattaaaa atatttataa aatttaaatc 1985 atgacatgacatctgctgga acaagagttt attctaagcc tatctataag gcaaatatta 2045 ttattactatcttccagaaa agaaacttga gactcagggt ccaagtgtta gttgctcagt 2105 catgtctgactctttgagac cccttggact gtagcccacc aggctcctct gtccatggga 2165 ttcttcagacaagaatactg gagcaggttg ctatttcctt ctccaggaaa tcttccctat 2225 ccagggatggaacccaggtc tcctgcattg caggtagatg ctttactatc tgagcaacca 2285 aatgaattactcaagtcagt agggggtaga ggcaaatttt aacttagttt tctctgaatc 2345 ataattgccacattaaactg gttcctgttg ggacatttgg ttgaaaaaaa taaagtgaaa 2405 aatgagtataaaactctata aatgtaatga tcaaaacgaa aaaaaatcta caatctgcat 2465 taaaaataaaaagggttggc agg 2488 49 465 PRT Bos sp. 
49 Met Lys Pro Pro Ile Leu ValPhe Ile Val Tyr Leu Leu Gln Leu Arg 1 5 10 15 Asp Cys Gln Cys Ala ProThr Gly Lys Asp Arg Thr Ser Ile Arg Glu 20 25 30 Asp Pro Lys Gly Phe SerLys Ala Gly Glu Ile Asp Val Asp Glu Glu 35 40 45 Val Lys Lys Ala Leu IleGly Met Lys Gln Met Lys Ile Leu Met Glu 50 55 60 Arg Arg Glu Glu Glu HisSer Lys Leu Met Arg Thr Leu Lys Lys Cys 65 70 75 80 Arg Glu Glu Lys GlnGlu Ala Leu Lys Leu Met Asn Glu Val Gln Glu 85 90 95 His Leu Glu Glu GluGlu Arg Leu Cys Gln Val Ser Leu Met Gly Ser 100 105 110 Trp Asp Glu CysLys Ser Cys Leu Glu Ser Asp Cys Met Arg Phe Tyr 115 120 125 Thr Thr CysGln Ser Ser Trp Ser Ser Met Lys Ser Thr Ile Glu Arg 130 135 140 Val PheArg Lys Ile Tyr Gln Phe Leu Phe Pro Phe His Glu Asp Asp 145 150 155 160Glu Lys Glu Leu Pro Val Gly Glu Lys Phe Thr Glu Glu Asp Val Gln 165 170175 Leu Met Gln Ile Glu Asn Val Phe Ser Gln Leu Thr Val Asp Val Gly 180185 190 Phe Leu Tyr Asn Met Ser Phe His Val Phe Lys Gln Met Gln Gln Glu195 200 205 Phe Asp Leu Ala Phe Gln Ser Tyr Phe Met Ser Asp Thr Asp SerMet 210 215 220 Glu Pro Tyr Phe Phe Pro Ala Phe Ser Lys Glu Pro Ala LysLys Ala 225 230 235 240 His Pro Met Gln Ser Trp Asp Ile Pro Ser Phe PheGln Leu Phe Cys 245 250 255 Asn Phe Ser Leu Ser Val Tyr Gln Ser Val SerAla Thr Val Thr Glu 260 265 270 Met Leu Lys Ala Ile Glu Asp Leu Ser LysGln Asp Lys Asp Ser Ala 275 280 285 His Gly Gly Pro Ser Ser Thr Thr TrpPro Val Arg Gly Arg Gly Leu 290 295 300 Cys Gly Glu Pro Gly Gln Asn SerSer Glu Cys Leu Gln Phe His Ala 305 310 315 320 Arg Cys Gln Lys Cys GlnAsp Tyr Leu Trp Ala Asp Cys Pro Ala Val 325 330 335 Pro Glu Leu Tyr ThrLys Ala Asp Glu Ala Leu Glu Leu Val Asn Ile 340 345 350 Ser Asn Gln GlnTyr Ala Gln Val Leu Gln Met Thr Gln His His Leu 355 360 365 Glu Asp ThrThr Tyr Leu Met Glu Lys Met Arg Glu Gln Phe Gly Trp 370 375 380 Val ThrGlu Leu Ala Ser Gln Thr Pro Gly Ser Glu Asn Ile Phe Ser 385 390 395 400Phe Ile Lys Val Val Pro Gly Val His Glu Gly Asn Phe Ser Lys Gln 405 410415 Asp Glu Lys Met Ile Asp Ile Ser Ile Leu Pro 
Ser Ser Asn Phe Thr 420 425 430 Leu Thr Ile Pro Leu Glu Glu Ser Ala Glu Ser Ser Asp Phe Ile Ser 435 440 445 Tyr Met Leu Ala Lys Ala Val Gln His Phe Lys Glu His Phe Lys Ser 450 455 460 Trp 465 50 8 PRT Homo sapiens 50 Asp Tyr Lys Asp Asp Asp Asp Lys 1 5 51 446 PRT Homo sapiens 51 Ala Pro Thr Trp Lys Asp Lys Thr Ala Ile Ser Glu Asn Leu Lys Ser 1 5 10 15 Phe Ser Glu Val Gly Glu Ile Asp Ala Asp Glu Glu Val Lys Lys Ala 20 25 30 Leu Thr Gly Ile Lys Gln Met Lys Ile Met Met Glu Arg Lys Glu Lys 35 40 45 Glu His Thr Asn Leu Met Ser Thr Leu Lys Lys Cys Arg Glu Glu Lys 50 55 60 Gln Glu Ala Leu Lys Leu Leu Asn Glu Val Gln Glu His Leu Glu Glu 65 70 75 80 Glu Glu Arg Leu Cys Arg Glu Ser Leu Ala Asp Ser Trp Gly Glu Cys 85 90 95 Arg Ser Cys Leu Glu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys Gln 100 105 110 Pro Ser Trp Ser Ser Val Lys Asn Lys Ile Glu Arg Phe Phe Arg Lys 115 120 125 Ile Tyr Gln Phe Leu Phe Pro Phe His Glu Asp Asn Glu Lys Asp Leu 130 135 140 Pro Ile Ser Glu Lys Leu Ile Glu Glu Asp Ala Gln Leu Thr Gln Met 145 150 155 160 Glu Asp Val Phe Ser Gln Leu Thr Val Asp Val Asn Ser Leu Phe Asn 165 170 175 Arg Ser Phe Asn Val Phe Arg Gln Met Gln Gln Glu Phe Asp Gln Thr 180 185 190 Phe Gln Ser His Phe Ile Ser Asp Thr Asp Leu Thr Glu Pro Tyr Phe 195 200 205 Phe Pro Ala Phe Ser Lys Glu Pro Met Thr Lys Ala Asp Leu Glu Gln 210 215 220 Cys Trp Asp Ile Pro Asn Phe Phe Gln Leu Phe Cys Asn Phe Ser Val 225 230 235 240 Ser Ile Tyr Glu Ser Val Ser Glu Thr Ile Thr Lys Met Leu Lys Ala 245 250 255 Ile Glu Asp Leu Pro Lys Gln Asp Lys Ala Pro Asp His Gly Gly Leu 260 265 270 Ile Ser Lys Met Leu Pro Gly Gln Asp Arg Gly Leu Cys Gly Glu Leu 275 280 285 Asp Gln Asn Leu Ser Arg Cys Phe Lys Phe His Glu Lys Cys Gln Lys 290 295 300 Cys Gln Ala His Leu Ser Glu Asp Cys Pro Asp Val Pro Ala Leu His 305 310 315 320 Thr Glu Leu Asp Glu Ala Ile Arg Leu Val Asn Val Ser Asn Gln Gln 325 330 335 Tyr Gly Gln Ile Leu Gln Met Thr Arg Lys His Leu Glu Asp Thr Ala 340 345 350 Tyr Leu Val Glu Lys Met Arg Gly Gln Phe Gly Trp Val Ser Glu Leu 355 360 365
Ala Asn Gln Ala Pro Glu Thr Glu Ile Ile Phe Asn Ser Ile Gln Val 370 375 380 Val Pro Arg Ile His Glu Gly Asn Ile Ser Lys Gln Asp Glu Thr Met 385 390 395 400 Met Thr Asp Leu Ser Ile Leu Pro Ser Ser Asn Phe Thr Leu Lys Ile 405 410 415 Pro Leu Glu Glu Ser Ala Glu Ser Ser Asn Phe Ile Gly Tyr Val Val 420 425 430 Ala Lys Ala Leu Gln His Phe Lys Glu His Phe Lys Thr Trp 435 440 445 52 44 DNA Artificial Sequence Primer 52 tttttctgaa ttcgccacca tgaaaattaa agcagagaaa aacg 44 53 69 DNA Artificial Sequence Primer 53 tttttgtcga cttatcactt gtcgtcgtcg tccttgtagt cccaggtttt aaaatgttcc 60 ttaaaatgc 69 54 40 DNA Artificial Sequence Primer 54 tttttctgaa ttcaccatga ggacctggga ctacagtaac 40 55 41 DNA Artificial Sequence Primer 55 tttttctctc gagaccatga aaattaaagc agagaaaaac g 41 56 47 DNA Artificial Sequence Primer 56 tttttggatc cgctgctgcc caggttttaa aatgttcctt aaaatgc 47 57 40 DNA Artificial Sequence Primer 57 tttttctctc gagaccatga ggacctggga ctacagtaac 40 58 37 DNA Artificial Sequence Primer 58 tttttctgaa ttcaccatga agccgccact cttggtg 37 59 60 DNA Artificial Sequence Primer 59 tttttggatc cgctgcggcc tccgtggtca ggagcttatt tttcacagag gaccagctag 60 60 36 DNA Artificial Sequence Primer 60 tttttctctc gaggactaca ggacacagct aaatcc 36 61 45 DNA Artificial Sequence Primer 61 tttttggatc cttatcacca ggttttaaaa tgttccttaa aatgc 45 62 37 DNA Artificial Sequence Primer 62 tttttctgaa ttcaccatga agccgccact cttggtg 37 63 40 DNA Artificial Sequence Primer 63 tttttctctc gagaccatga ggacctggga ctacagtaac 40 64 466 PRT Homo sapiens 64 Met Lys Pro Pro Leu Leu Val Phe Ile Val Cys Leu Leu Trp Leu Lys 1 5 10 15 Asp Ser His Cys Ala Pro Thr Trp Lys Asp Lys Thr Ala Ile Ser Glu 20 25 30 Asn Leu Lys Ser Phe Ser Glu Val Gly Glu Ile Asp Ala Asp Glu Glu 35 40 45 Val Lys Lys Ala Leu Thr Gly Ile Lys Gln Met Lys Ile Met Met Glu 50 55 60 Arg Lys Glu Lys Glu His Thr Asn Leu Met Ser Thr Leu Lys Lys Cys 65 70 75 80 Arg Glu Glu Lys Gln Glu Ala Leu Lys Leu Leu Asn Glu Val Gln Glu 85 90 95 His Leu Glu Glu Glu Glu Arg Leu Cys Arg Glu Ser Leu Ala Asp Ser 100 105
110 Trp Gly Glu Cys Arg Ser Cys Leu Glu AsnAsn Cys Met Arg Ile Tyr 115 120 125 Thr Thr Cys Gln Pro Ser Trp Ser SerVal Lys Asn Lys Ile Glu Arg 130 135 140 Phe Phe Arg Lys Ile Tyr Gln PheLeu Phe Pro Phe His Glu Asp Asn 145 150 155 160 Glu Lys Asp Leu Pro IleSer Glu Lys Leu Ile Glu Glu Asp Ala Gln 165 170 175 Leu Thr Gln Met GluAsp Val Phe Ser Gln Leu Thr Val Asp Val Asn 180 185 190 Ser Leu Phe AsnArg Ser Phe Asn Val Phe Arg Gln Met Gln Gln Glu 195 200 205 Phe Asp GlnThr Phe Gln Ser His Phe Ile Ser Asp Thr Asp Leu Thr 210 215 220 Glu ProTyr Phe Phe Pro Ala Phe Ser Lys Glu Pro Met Thr Lys Ala 225 230 235 240Asp Leu Glu Gln Cys Trp Asp Ile Pro Asn Phe Phe Gln Leu Phe Cys 245 250255 Asn Phe Ser Val Ser Ile Tyr Glu Ser Val Ser Glu Thr Ile Thr Lys 260265 270 Met Leu Lys Ala Ile Glu Asp Leu Pro Lys Gln Asp Lys Ala Pro Asp275 280 285 His Gly Gly Leu Ile Ser Lys Met Leu Pro Gly Gln Asp Arg GlyLeu 290 295 300 Cys Gly Glu Leu Asp Gln Asn Leu Ser Arg Cys Phe Lys PheHis Glu 305 310 315 320 Lys Cys Gln Lys Cys Gln Ala His Leu Ser Glu AspCys Pro Asp Val 325 330 335 Pro Ala Leu His Thr Glu Leu Asp Glu Ala IleArg Leu Val Asn Val 340 345 350 Ser Asn Gln Gln Tyr Gly Gln Ile Leu GlnMet Thr Arg Lys His Leu 355 360 365 Glu Asp Thr Ala Tyr Leu Val Glu LysMet Arg Gly Gln Phe Gly Trp 370 375 380 Val Ser Glu Leu Ala Asn Gln AlaPro Glu Thr Glu Ile Ile Phe Asn 385 390 395 400 Ser Ile Gln Val Val ProArg Ile His Glu Gly Asn Ile Ser Lys Gln 405 410 415 Asp Glu Thr Met MetThr Asp Leu Ser Ile Leu Pro Ser Ser Asn Phe 420 425 430 Thr Leu Lys IlePro Leu Glu Glu Ser Ala Glu Ser Ser Asn Phe Ile 435 440 445 Gly Tyr ValVal Ala Lys Ala Leu Gln His Phe Lys Glu His Phe Lys 450 455 460 Thr Trp465 65 1607 DNA H. 
sapiens misc_feature (1)...(1607) N = A,T, C, or G 65tgcgtcacct gcaggcccgg gccgcggggt tggtttccac cctggaggtt gctgacaccc 60tgtgccctcg gctgacttcc agccggtggc acagacgcct ccagggggca gcactcaagc 120gcatcttagg aatgacagag ttgcgtccct ctctgttgcc aggctggagt tcagtggcat 180gttcttagct cactgaagcc tcaaattcct gggttcaagt gaccctccca cctcagcccc 240atgaggacct gggactacag gacacagcta aatccctgac acggatgaaa attaaagcag 300agaaaaacga aggtccttcc agaagctggt ggcaacttca ctggggagat attgcaaata 360acagcgggaa catgaagccg ccactcttgg tgtttattgt gtgtctgctg tggttgaaag 420acagtcactg cgcacccact tggaaggaca aaactgctat cagtgaaaac ctgaagagtt 480tttctgaggt gggggagata gatgcagatg aagaggtgaa gaaggctttg actggtatta 540agcaaatgaa aatcatgatg gaaagaaaag agaaggaaca caccaatcta atgagcaccc 600tgaagaaatg cagagaagaa aagcaggagg ccctgaaact tctgaatgaa gttcaagaac 660atctggagga agaagaaagg ctatgccggg agtctttggc agattcctgg ggtgaatgca 720ggtcttgcct ggaaaataac tgcatgagaa tttatacaac ctgccaacct agctggtcct 780ctgtgaaaaa taagctcctg accacggagg cctgatttca aagatgttac ntgggcagga 840cagaggactg tgtggggaac ttgaccagaa tttgtcaaga tgtttcaaat ttcatgaaaa 900atgccaaaaa tgtcaggctc acctatctga agactgtcct gatgtacctg ctctgcacac 960agaattagac gaggcgatca ggttggtcaa tgtatccaat cagcagtatg gccagattct 1020ccagatgacc cggaagcact tggaggacac cgcctatctg gtggagaaga tgagagggca 1080atttggctgg gtgtctgaac tggcaaacca ggccccagaa acagcaatac aggtagttcc 1140aaggattcat gaaggaaata tttccaaaca agatgaaaca atgatgacag acttaagcat 1200tctgccttcc tctaatttca cactcaagat ccctcttgaa gaaagtgctg agagttctaa 1260cttcattggc tacgtagtgg caaaagctct acagcatttt aaggaacatt ttaaaacctg 1320gtaagaagat ctaatgcatc ctatatccag taagtagaat tatctcttca tctgggacct 1380ggaaatcctg aaataaaaaa ggataatgca ataaacacag ttgcaggaaa gtatgttagc 1440tatatactat gaagtactct tagtttactt atgttgaatg gcttagctat taatactcaa 1500attgagttaa aatgaaaatt cctccttaaa aaatcaaacg taatatgtat tacatttcat 1560ggtacattag tagttctttg tatattgaat aaatactaaa tcaccta 1607 66 521 PRT Homosapiens 66 Arg His Leu Gln Ala Arg Ala Ala Gly Leu Val Ser Thr Leu 
GluVal 1 5 10 15 Ala Asp Thr Leu Cys Pro Arg Leu Thr Ser Ser Arg Trp HisArg Arg 20 25 30 Leu Gln Gly Ala Ala Leu Lys Arg Ile Leu Gly Met Thr GluLeu Arg 35 40 45 Pro Ser Leu Leu Pro Gly Trp Ser Ser Val Ala Cys Ser LeuThr Glu 50 55 60 Ala Ser Asn Ser Trp Val Gln Val Thr Leu Pro Pro Gln ProHis Glu 65 70 75 80 Asp Leu Gly Leu Gln Asp Thr Ala Lys Ser Leu Thr ArgMet Lys Ile 85 90 95 Lys Ala Glu Lys Asn Glu Gly Pro Ser Arg Ser Trp TrpGln Leu His 100 105 110 Trp Gly Asp Ile Ala Asn Asn Ser Gly Asn Met LysPro Pro Leu Leu 115 120 125 Val Phe Ile Val Cys Leu Leu Trp Leu Lys AspSer His Cys Ala Pro 130 135 140 Thr Trp Lys Asp Lys Thr Ala Ile Ser GluAsn Leu Lys Ser Phe Ser 145 150 155 160 Glu Val Gly Glu Ile Asp Ala AspGlu Glu Val Lys Lys Ala Leu Thr 165 170 175 Gly Ile Lys Gln Met Lys IleMet Met Glu Arg Lys Glu Lys Glu His 180 185 190 Thr Asn Leu Met Ser ThrLeu Lys Lys Cys Arg Glu Glu Lys Gln Glu 195 200 205 Ala Leu Lys Leu LeuAsn Glu Val Gln Glu His Leu Glu Glu Glu Glu 210 215 220 Arg Leu Cys ArgGlu Ser Leu Ala Asp Ser Trp Gly Glu Cys Arg Ser 225 230 235 240 Cys LeuGlu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys Gln Pro Ser 245 250 255 TrpSer Ser Val Lys Asn Lys Leu Leu Thr Thr Glu Ala Phe Gln Arg 260 265 270Cys Tyr Leu Gly Arg Thr Glu Asp Cys Val Gly Asn Leu Thr Arg Ile 275 280285 Cys Gln Asp Val Ser Asn Phe Met Lys Asn Ala Lys Asn Val Arg Leu 290295 300 Thr Tyr Leu Lys Thr Val Leu Met Tyr Leu Leu Cys Thr Gln Asn Thr305 310 315 320 Arg Arg Ser Gly Trp Ser Met Tyr Pro Ile Ser Ser Met AlaArg Phe 325 330 335 Ser Arg Pro Gly Ser Thr Trp Arg Thr Pro Pro Ile TrpTrp Arg Arg 340 345 350 Glu Gly Asn Leu Ala Gly Cys Leu Asn Trp Gln ThrArg Pro Gln Lys 355 360 365 Gln Arg Ser Ser Leu Ile Gln Tyr Arg Phe GlnGly Phe Met Lys Glu 370 375 380 Ile Phe Pro Asn Lys Met Lys Gln Gln ThrAla Phe Cys Leu Pro Leu 385 390 395 400 Ile Ser His Ser Arg Ser Leu LeuLys Lys Val Leu Arg Val Leu Thr 405 410 415 Ser Leu Ala Thr Trp Gln LysLeu Tyr Ser Ile Leu Arg Asn Ile Leu 420 425 430 Lys Pro Gly Lys Lys IleCys 
Ile Leu Tyr Pro Val Ser Arg Ile Ile 435 440 445 Ser Ser Ser Gly ThrTrp Lys Ser Asn Lys Lys Gly Cys Asn Lys His 450 455 460 Ser Cys Arg LysVal Cys Leu Tyr Thr Met Lys Tyr Ser Phe Thr Tyr 465 470 475 480 Val GluTrp Leu Ser Tyr Tyr Ser Asn Val Lys Met Lys Ile Pro Pro 485 490 495 LysIle Lys Arg Asn Met Tyr Tyr Ile Ser Trp Tyr Ile Ser Ser Ser 500 505 510Leu Tyr Ile Glu Ile Leu Asn His Leu 515 520 67 20 DNA ArtificialSequence Primer 67 agttgcgtcc ctctctgttg 20 68 20 DNA ArtificialSequence Primer 68 gcttcatgtt cccgctgtta 20 69 26 DNA ArtificialSequence Primer 69 acgccgcggg cccctgcggg acgggt 26 70 27 DNA ArtificialSequence Primer 70 ccatcctaat acgactcact atagggc 27 71 26 DNA ArtificialSequence Primer 71 ggagccgctg ggacgcggct tacctc 26 72 27 DNA ArtificialSequence Primer 72 ccatcctaat acgactcact atagggc 27 73 564 DNA Homosapiens misc_feature (1)...(564) n = A,T,C or G 73 ggtgtctatg ttctatcacatctacaaaca tgtcacttcc taattaacaa aatgttcttc 60 ctttagtttg cttttgcacttaaaatatat ataattgact tttttggaaa aaaatctaag 120 attcattgct ttgttttgtaaagaccaata ggttctgtat agtctttttt taaattgtgg 180 taaaatacac atggcattaatttaccattt taaccatttt aaagtgcaca atttgtggca 240 ttaagtacac tcacgttgctgtgcaaccat caccaccgtc catcttcaga acctttttat 300 cttcctaaac tgaaactctgtactcgttaa gcactcactt cccttttccc catcccccag 360 cccgtagcaa ccacgactgtactttctatg aatttgacta ctctaggtac tgcatgtagg 420 tggaatcata cagtatttgtcttttgcttg ntttgntttg ttttttgttt tctaagacag 480 ggtctcactc tgtcgccctagctggattgc agagttaagt ttatgattat gaaataaaaa 540 ctaaataacn attgtcctcgtttg 564 74 1161 DNA Homo sapiens 74 cctgaaagcc tggcgccaat gacccgcgagacattttttg cctggggtgc tcctgtcgga 60 aaggaaagag gaaaggacga ctaagaacttatactcgaac tcccgaattt ctcttttcaa 120 ggtttaagag gaaagctggt tcgtggggattggatgggag gccaccagga aaccaagttc 180 ccgcgccagc ttcagtgctc tcctcttyccgccgcctttg ccccgcccac atcactttcg 240 ctccagtttt tgaaaacgct gcgaagcggaatggtccaca ggggaaaacg gaggaggggc 300 caaagccagg actttgagac cggcgcgcggtcaagcccag gcagctctcc ctaaccctcc 360 agcactgggc aaacgctgcc 
cgatgacgcccgcctcgggg gccacggcat cactggggcg 420 actgcgagcc cggccgcgga gccgctgggacgcggcttac ctcccggctg tcgctgctgt 480 gtgtgttgcc cgcgccagtc acgtccctaatgggaccctc cgtttcggcg tctgtaaggc 540 gaggaggacg atgcgtcccc tccctsgcaggattgaggtt aggactaaac ggggtccgca 600 gcgcccggca gctcccgagc gctctccccagccgcgcctc cctccttccc gccacccgtc 660 ccgcaggggc ccgcggcgtc acctctcaggctgtagcgcg cctgcatgcc gaataccgac 720 agggtgccgg tgcccgtgcg gtcgtccttcctgacgccgc agcggaggat gtgttggatc 780 tgccccagga tttccaggtc ccagatgaagagataattct acttactgga tataggatgc 840 attagatctt cttaccttaa aaaaaaaaaaaaaggcagca atgatcaaaa tactaataaa 900 ttactcacag actcagtgta ttttttcttggagtaaaagt ccaggatggg taatagaata 960 cctgctgttg gcttttggaa aaattggtactgtatgtagc aaaataatgt gaaacccata 1020 tgcatggata ttcttaacaa tttgaagaaatcgtcacagc tttcctgggt tgttgagcct 1080 ctaaaatggt cttttcctct gatgtgataataaagtgttt attttgaact caaaaaaaaa 1140 aaaaaaaaaa aaaaaaaaaa a 1161 75123 PRT Homo sapiens VARIANT (1)...(123) Xaa = Any Amino Acid 75 Met ThrPro Ala Ser Gly Ala Thr Ala Ser Leu Gly Arg Leu Arg Ala 1 5 10 15 ArgPro Arg Ser Arg Trp Asp Ala Ala Tyr Leu Pro Ala Val Ala Ala 20 25 30 ValCys Val Ala Arg Ala Ser His Val Pro Asn Gly Thr Leu Arg Phe 35 40 45 GlyVal Cys Lys Ala Arg Arg Thr Met Arg Pro Leu Pro Xaa Arg Ile 50 55 60 GluVal Arg Thr Lys Arg Gly Pro Gln Arg Pro Ala Ala Pro Glu Arg 65 70 75 80Ser Pro Gln Pro Arg Leu Pro Pro Ser Arg His Pro Ser Arg Arg Gly 85 90 95Pro Arg Arg His Leu Ser Gly Cys Ser Ala Pro Ala Cys Arg Ile Pro 100 105110 Thr Gly Cys Arg Cys Pro Cys Gly Arg Pro Ser 115 120 76 105 PRT Homosapiens 76 Met Gly Pro Ser Val Ser Ala Ser Val Arg Arg Gly Gly Arg CysVal 1 5 10 15 Pro Ser Leu Ala Gly Leu Arg Leu Gln Gly Val Arg Ser AlaArg Gln 20 25 30 Leu Pro Ser Ala Leu Pro Ser Arg Ala Ser Leu Leu Pro AlaTrp Ala 35 40 45 Gly Arg Val Thr Ser Gln Ala Val Ala Arg Leu His Ala GluTyr Arg 50 55 60 Gln Gly Ala Gly Ala Arg Ala Val Val Leu Pro Asp Ala AlaAla Glu 65 70 75 80 Asp Val Leu Asp Leu Pro Gln Asp Phe Gln Val Pro AspGlu Glu Ile 85 90 95 
Ile Leu Leu Thr Gly Tyr Arg Met His 100 105 77 21 DNA Artificial Sequence Primer 77 aacggctgcc taacgtcctg t 21 78 20 DNA Artificial Sequence Primer 78 ggagagctgc ctgggcttga 20 79 23 DNA Artificial Sequence Primer 79 ttgaaaacgc tgcgaagcgg aat 23 80 20 DNA Artificial Sequence Primer 80 cgctacagcc tgagaggtga 20 81 23 DNA Artificial Sequence Primer 81 aggattgagg ttaggactaa acg 23 82 20 DNA Artificial Sequence Primer 82 tggcgcacgc tctctagagc 20 83 25 DNA Artificial Sequence Primer 83 ccattcaaca taagtaaact aagag 25 84 22 DNA Artificial Sequence Primer 84 gcttttgtag atgggctctt ac 22 85 24 DNA Artificial Sequence Primer 85 ggaacacacc aatctaatga gcac 24 86 28 DNA Artificial Sequence Primer 86 gttggcaggt tgtataaatt ctcatgca 28 87 30 DNA Homo sapiens 87 aggctatgcc gggagtcttt ggcagattcc 30 88 19 DNA Artificial Sequence Primer 88 gaaggtgaag gtcggagtc 19 89 20 DNA Artificial Sequence Primer 89 gaagatggtg atgggatttc 20 90 20 DNA Homo sapiens 90 caagcttccc gttctcagcc 20 91 25 DNA Artificial Sequence Primer 91 ctgagtggag aagatgagag aggca 25 92 26 DNA Artificial Sequence Primer 92 tttaaaagtg cttccttaaa atgctg 26 93 26 DNA Artificial Sequence Primer 93 tttaaaagtg cttccttaaa gtgctg 26 94 26 DNA Artificial Sequence Primer 94 gatgagagag gcaagtttgg ctgggt 26 95 26 DNA Artificial Sequence Primer 95 gatgagagag gcaagtttgg ttgggt 26 96 25 DNA Artificial Sequence Primer 96 gagtgtgaaa gttagaggaa ggcag 25 97 65 DNA Artificial Sequence Primer 97 cacaccagta gacccacaca gccaccatcg atgcggccgc ggatccattt tttttttttt 60 ttttt 65 98 24 DNA Artificial Sequence Primer 98 tgggtgtctc aactggcaag ccat 24 99 24 DNA Artificial Sequence Primer 99 cacaccagta gacccacaca gcca 24 100 24 DNA Artificial Sequence Primer 100 cataacccag tgactgagga catc 24 101 24 DNA Artificial Sequence Primer 101 accatcgatg cggccgcgga tcca 24 102 29 DNA Artificial Sequence Primer 102 cagatctgct gcagcctcac agggaagga 29 103 29 DNA Artificial Sequence Primer 103 cagatctgct gcagcctcac atggaagga 29 104 29 DNA Artificial Sequence Primer 104 cagatctgct gcagcctcac ttggaagga
29 105 29 DNA Artificial Sequence Primer105 cagatctgct gcagcctcac tgggaagga 29 106 24 DNA Artificial SequencePrimer 106 ctgcttggaa gaatctcctc catg 24 107 45 DNA Artificial SequencePrimer 107 tgtaaaacga cggccagtgc ggcacgaggc acatcgtaaa aagtg 45 108 42DNA Artificial Sequence Primer 108 caggaaacag ctatgacccc taccctctcaacaaagcttt cc 42 109 117 DNA Rattus 109 gtctcaactg gcaagccata accagtgactgaggacatct ttaattcaac aaaggcagtt 60 ccaaagattc atggaggaga ttcttccaagcaggatgaaa ttatggtaga ctcaagc 117 110 39 PRT Rattus 110 Ser Gln Leu AlaSer His Asn Pro Val Thr Glu Asp Ile Phe Asn Ser 1 5 10 15 Thr Lys AlaVal Pro Lys Ile His Gly Gly Asp Ser Ser Lys Gln Asp 20 25 30 Glu Ile MetVal Asp Ser Ser 35 111 289 DNA Rattus 111 cataacccag tgactgaggacatctttaat tcaacaaagg cagttccaaa gattcatgga 60 ggagattctt ccaagcaggatgaaattatg gtagactcaa gcagcattct gccttcctct 120 aacttcaccg tccagaatcctcctgaagaa ggtgctgaga gctcaaatgt tatttactac 180 atggcagcta aagttctgcagcatctaaag ggatgttttg aaacttggta agaatagctg 240 attaggaaag ctttgttgagagggtaggta acataaaaaa aaaaaaaaa 289 112 92 PRT Rattus 112 His Asn ProVal Thr Glu Asp Ile Phe Asn Ser Thr Lys Ala Val Pro 1 5 10 15 Lys IleHis Gly Gly Asp Ser Ser Lys Gln Asp Glu Ile Met Val Asp 20 25 30 Ser SerSer Ile Leu Pro Ser Ser Asn Phe Thr Val Gln Asn Pro Pro 35 40 45 Glu GluGly Ala Glu Ser Ser Asn Val Ile Tyr Tyr Met Ala Ala Lys 50 55 60 Val LeuGln His Leu Lys Gly Cys Phe Glu Thr Trp Glu Leu Ile Arg 65 70 75 80 LysAla Leu Leu Arg Gly Val Thr Lys Lys Lys Lys 85 90 113 1120 DNA Rattus113 cccttcactg cgcgcccact gggaaggaga cagatgctac ggatggaaac ctaaagagtc 60ttccagaggt aggagaggca gatgtagagg gagaggtcaa gaaggctttg attggcatta 120agcaaatgaa aatcatgatg gaaaggagag aggaggaaca cgcaaaattg atgaaagcct 180tgaagaagtg caaagaagaa aagcaggagg cccagaaact catgaacgaa gtgcaagaac 240gtctggagga agaagaaaag ctatgtcagg catcttctat aggttcttgg gatggatgca 300ggccatgttt ggaaagtaac tgcatacgat tttatacagc ttgccaacct ggttggtcct 360ctgtgaaaag catgatgaag caatttctca agaagatata ccgatttctg tcttcccaga 420gtgaagatgt 
aaaggatccc cctgccatag aacagctgac taaggaagat ttacaagtgg 480tacacataga gaacctgttt agccagctgg ccgtggatgc aaaatctctc ttcaacatga 540gcttttacat ttttaagcag atgcagcaag aatttgatca ggcttttcaa ttatacttca 600tgtccgatgt ggacttaatg gagccatacc ccccagcttt atctaaagag ataatcaaaa 660aagaagaact tgggcaaagg tggggcattc ccaatgtctt ccagctgttt cataatttca 720gtctctctgt ttatgggaga gtccaacaaa taataatgaa gacactcaat gcaattgaag 780attcatggga accacacaaa gagttagacc agagaggtat gacttcagag atgttacctg 840agcaaaatgg agaaatgtgt gaggaatttg tcaagaattt atctggatgt ttaaaatttc 900gtaaaagatg ccaaaaatgt cacaattacc tatctgaaga atgccctgat gtacctgaac 960ttcacataga attccttgag gccctgaaat tagtcaatgt atccaatcag caatatgatc 1020agattgtcca gatgacccag tatcatttgg aagataccat atacctgatg gagaaaatgc 1080aagagcagtt tggatgggtg tctcaactgg caagccataa 1120 114 397 PRT Rattus 114Leu His Cys Ala Pro Thr Gly Lys Glu Thr Asp Ala Thr Asp Gly Asn 1 5 1015 Leu Lys Ser Leu Pro Glu Val Gly Glu Ala Asp Val Glu Gly Glu Val 20 2530 Lys Lys Ala Leu Ile Gly Ile Lys Gln Met Lys Ile Met Met Glu Arg 35 4045 Arg Glu Glu Glu His Ala Lys Leu Met Lys Ala Leu Lys Lys Cys Lys 50 5560 Glu Glu Lys Gln Glu Ala Gln Lys Leu Met Asn Glu Val Gln Glu Arg 65 7075 80 Leu Glu Glu Glu Glu Lys Leu Cys Gln Ala Ser Ser Ile Gly Ser Trp 8590 95 Asp Gly Cys Arg Pro Cys Leu Glu Ser Asn Cys Ile Arg Phe Tyr Thr100 105 110 Ala Cys Gln Pro Gly Trp Ser Ser Val Lys Ser Met Met Lys GlnPhe 115 120 125 Leu Lys Lys Ile Tyr Arg Phe Leu Ser Ser Gln Ser Glu AspVal Lys 130 135 140 Asp Pro Pro Ala Ile Glu Gln Leu Thr Lys Glu Asp LeuGln Val Val 145 150 155 160 His Ile Glu Asn Leu Phe Ser Gln Leu Ala ValAsp Ala Lys Ser Leu 165 170 175 Phe Asn Met Ser Phe Tyr Ile Phe Lys GlnMet Gln Gln Glu Phe Asp 180 185 190 Gln Ala Phe Gln Leu Tyr Phe Met SerAsp Val Asp Leu Met Glu Pro 195 200 205 Tyr Pro Pro Ala Leu Ser Lys GluIle Ile Lys Lys Glu Glu Leu Gly 210 215 220 Gln Arg Trp Gly Ile Pro AsnVal Phe Gln Leu Phe His Asn Phe Ser 225 230 235 240 Leu Ser Val Tyr GlyArg Val Gln Gln Ile Ile Met Lys Thr Leu 
Asn 245 250 255 Ala Ile Glu AspSer Trp Glu Pro His Lys Glu Leu Asp Gln Arg Gly 260 265 270 Met Thr SerGlu Met Leu Pro Glu Gln Asn Gly Glu Met Cys Glu Glu 275 280 285 Phe ValLys Asn Leu Ser Gly Cys Leu Lys Phe Arg Lys Arg Cys Gln 290 295 300 LysCys His Asn Tyr Leu Ser Glu Glu Cys Pro Asp Val Pro Glu Leu 305 310 315320 His Ile Glu Phe Leu Glu Ala Leu Lys Leu Val Asn Val Ser Asn Gln 325330 335 Gln Tyr Asp Gln Ile Val Gln Met Thr Gln Tyr His Leu Glu Asp Thr340 345 350 Ile Tyr Leu Met Glu Lys Met Gln Glu Gln Phe Gly Trp Val SerGln 355 360 365 Leu Ala Ser His Asn Pro Val Thr Glu Asp Ile Phe Asn SerThr Lys 370 375 380 Ala Val Pro Lys Ile His Gly Gly Asp Ser Ser Lys Gln385 390 395 115 341 DNA Rattus 115 tttttttttt tttttttcaa ggctttcatcaattttgcgt gttcctcctc tctcctttcc 60 atcatgattt tcatttgctt aatgccaatcaaagccttct tgacctctcc ctctacatct 120 gcctctccta cctctggaag actctttaggtttccatccg tagcatctgt ctccttccaa 180 gtaggtgcac tgtcacaata tttcaaccataacagataca cagaaatcac aaagagtggt 240 ggctgcatgg tccagtgttc caccgatattgcagctctcc ccagagaaat tgccactaac 300 ttctgaaagg accttcactt tttacgatgtgcctcgtgcc g 341 116 341 DNA Rattus 116 cggcacgagg cacatcgtaa aaagtgaaggtcctttcaga agttagtggc aatttctctg 60 gggagagctg caatatcggt ggaacactggaccatgcagc caccactctt tgtgatttct 120 gtgtatctgt tatggttgaa atattgtgacagtgcaccta cttggaagga gacagatgct 180 acggatggaa acctaaagag tcttccagaggtaggagagg cagatgtaga gggagaggtc 240 aagaaggctt tgattggcat taagcaaatgaaaatcatga tggaaaggag agaggaggaa 300 cacgcaaaat tgatgaaagc cttgaaaaaaaaaaaaaaaa a 341 117 112 PRT Rattus 117 Arg His Glu Ala His Arg Lys LysArg Ser Phe Gln Lys Leu Val Ala 1 5 10 15 Ile Ser Leu Gly Arg Ala AlaIle Ser Val Glu His Trp Thr Met Gln 20 25 30 Pro Pro Leu Phe Val Ile SerVal Tyr Leu Leu Trp Leu Lys Tyr Cys 35 40 45 Asp Ser Ala Pro Thr Trp LysGlu Thr Asp Ala Thr Asp Gly Asn Leu 50 55 60 Lys Ser Leu Pro Glu Val GlyGlu Ala Asp Val Glu Gly Glu Val Lys 65 70 75 80 Lys Ala Leu Ile Gly IleLys Gln Met Lys Ile Met Met Glu Arg Arg 85 90 95 Glu Glu Glu His Ala LysLeu 
Met Lys Ala Leu Lys Lys Lys Lys Lys 100 105 110 118 56 PRT Rattus118 Thr Asp Ala Thr Asp Gly Asn Leu Lys Ser Leu Pro Glu Val Gly Glu 1 510 15 Ala Asp Val Glu Gly Glu Val Lys Lys Ala Leu Ile Gly Ile Lys Gln 2025 30 Met Lys Ile Met Met Glu Arg Arg Glu Glu Glu His Ala Lys Leu Met 3540 45 Lys Ala Leu Lys Lys Lys Lys Lys 50 55 119 1545 DNA Rattus 119ggcaccgagg cacatcgtaa aaagtgaagg tcctttcaga agttagtggc aatttctctg 60gggagagctg caatatcggt ggaacactgg accatgcagc caccactctt tgtgatttct 120gtgtatctgt tatggtgaaa tattgtgaca gtgcacctac ttggaaggag acagatgcta 180cggatggaaa cctaaagagt cttccagagg taggagaggc agatgtagag ggagaggtca 240agaaggcttt gattggcatt aagcaaatga aaatcatgat ggaaaggaga gaggaggaac 300acgcaaaatt gatgaaagcc ttgaagaagt gcaaagaaga aaagcaggag gcccagaaac 360tcatgaacga agtgcaagaa cgtctggagg aagaagaaaa gctatgtcag gcatcttcta 420taggttcttg ggatggatgc aggccatgtt tggaaagtaa ctgcatacga ttttatacag 480cttgccaacc tggttggtcc tctgtgaaaa gcatgatgaa gcaatttctc aagaagatat 540accgatttct gtcttcccag agtgaagatg taaaggatcc ccctgccata gaacagctga 600ctaaggaaga tttacaagtg gtacacatag agaacctgtt tagccagctg gccgtggatg 660caaaatctct cttcaacatg agcttttaca tttttaagca gatgcagcaa gaatttgatc 720aggcttttca attatacttc atgtccgatg tggacttaat ggagccatac cccccagctt 780tatctaaaga gataatcaaa aaagaagaac ttgggcaaag gtggggcatt cccaatgtct 840tccagctgtt tcataatttc agtctctctg tttatgggag agtccaacaa ataataatga 900agacactcaa tgcaattgaa gattcatggg aaccacacaa agagttagac cagagaggta 960tgacttcaga gatgttacct gagcaaaatg gagaaatgtg tgaggaattt gtcaagaatt 1020tatctggatg tttaaaattt cgtaaaagat gccaaaaatg tcacaattac ctatctgaag 1080aatgccctga tgtacctgaa cttcacatag aattccttga ggccctgaaa ttagtcaatg 1140tatccaatca gcaatatgat cagattgtcc agatgaccca gtatcatttg gaagatacca 1200tatacctgat ggagaaaatg caagagcagt ttggatgggt gtctcaactg gcaagccata 1260acccagtgac tgaggacatc tttaattcaa caaaggcagt tccaaagatt catggaggag 1320attcttccaa gcaggatgaa attatggtag actcaagcag cattctgcct tcctctaact 1380tcaccgtcca gaatcctcct gaagaaggtg ctgagagctc aaatgttatt 
tactacatgg 1440cagctaaagt tctgcagcat ctaaagggat gttttgaaac ttggtaagaa tagctgatta 1500ggaaagcttt gttgagaggg taggtaacat aaaaaaaaaa aaaaa 1545 120 512 PRTRattus 120 His Arg Gly Thr Ser Glx Lys Val Lys Val Leu Ser Glu Val SerGly 1 5 10 15 Asn Phe Ser Gly Glu Ser Cys Asn Ile Gly Gly Thr Leu AspHis Ala 20 25 30 Ala Thr Thr Leu Cys Asp Phe Cys Val Ser Val Met Val LysTyr Cys 35 40 45 Asp Ser Ala Pro Thr Trp Lys Glu Thr Asp Ala Thr Asp GlyAsn Leu 50 55 60 Lys Ser Leu Pro Glu Val Gly Glu Ala Asp Val Glu Gly GluVal Lys 65 70 75 80 Lys Ala Leu Ile Gly Ile Lys Gln Met Lys Ile Met MetGlu Arg Arg 85 90 95 Glu Glu Glu His Ala Lys Leu Met Lys Ala Leu Lys LysCys Lys Glu 100 105 110 Glu Lys Gln Glu Ala Gln Lys Leu Met Asn Glu ValGln Glu Arg Leu 115 120 125 Glu Glu Glu Glu Lys Leu Cys Gln Ala Ser SerIle Gly Ser Trp Asp 130 135 140 Gly Cys Arg Pro Cys Leu Glu Ser Asn CysIle Arg Phe Tyr Thr Ala 145 150 155 160 Cys Gln Pro Gly Trp Ser Ser ValLys Ser Met Met Lys Gln Phe Leu 165 170 175 Lys Lys Ile Tyr Arg Phe LeuSer Ser Gln Ser Glu Asp Val Lys Asp 180 185 190 Pro Pro Ala Ile Glu GlnLeu Thr Lys Glu Asp Leu Gln Val Val His 195 200 205 Ile Glu Asn Leu PheSer Gln Leu Ala Val Asp Ala Lys Ser Leu Phe 210 215 220 Asn Met Ser PheTyr Ile Phe Lys Gln Met Gln Gln Glu Phe Asp Gln 225 230 235 240 Ala PheGln Leu Tyr Phe Met Ser Asp Val Asp Leu Met Glu Pro Tyr 245 250 255 ProPro Ala Leu Ser Lys Glu Ile Ile Lys Lys Glu Glu Leu Gly Gln 260 265 270Arg Trp Gly Ile Pro Asn Val Phe Gln Leu Phe His Asn Phe Ser Leu 275 280285 Ser Val Tyr Gly Arg Val Gln Gln Ile Ile Met Lys Thr Leu Asn Ala 290295 300 Ile Glu Asp Ser Trp Glu Pro His Lys Glu Leu Asp Gln Arg Gly Met305 310 315 320 Thr Ser Glu Met Leu Pro Glu Gln Asn Gly Glu Met Cys GluGlu Phe 325 330 335 Val Lys Asn Leu Ser Gly Cys Leu Lys Phe Arg Lys ArgCys Gln Lys 340 345 350 Cys His Asn Tyr Leu Ser Glu Glu Cys Pro Asp ValPro Glu Leu His 355 360 365 Ile Glu Phe Leu Glu Ala Leu Lys Leu Val AsnVal Ser Asn Gln Gln 370 375 380 Tyr Asp Gln Ile Val Gln Met Thr Gln 
TyrHis Leu Glu Asp Thr Ile 385 390 395 400 Tyr Leu Met Glu Lys Met Gln GluGln Phe Gly Trp Val Ser Gln Leu 405 410 415 Ala Ser His Asn Pro Val ThrGlu Asp Ile Phe Asn Ser Thr Lys Ala 420 425 430 Val Pro Lys Ile His GlyGly Asp Ser Ser Lys Gln Asp Glu Ile Met 435 440 445 Val Asp Ser Ser SerIle Leu Pro Ser Ser Asn Phe Thr Val Gln Asn 450 455 460 Pro Pro Glu GluGly Ala Glu Ser Ser Asn Val Ile Tyr Tyr Met Ala 465 470 475 480 Ala LysVal Leu Gln His Leu Lys Gly Cys Phe Glu Thr Trp Glu Leu 485 490 495 IleArg Lys Ala Leu Leu Arg Gly Asn Val Thr Asn Lys Lys Lys Lys 500 505 510121 221 DNA Homo sapiens 121 gaattagacg aggcgatcag gttggtcaat gtatccaatcagcagtatgg ccagattctc 60 cagatgaccc ggaagcactt ggaggacacc gcctatctggtggagaagat gagagggcaa 120 tttggctggg tgtctgaact ggcaaaccag gccccagaaacagagatcat ctttaattca 180 atacaggtaa gaagatctaa tgcatcctat atccagtaag t221 122 524 DNA Homo sapiens 122 acacagaatt agacgaggcg atcaggttggtcaatgtatc caatcagcag tatggccaga 60 ttctccagat gacccggaag cacttggaggacaccgccta tctggtggag aagatgagag 120 ggcaatttgg ctgggtgtct gaactggcaaaccaggcccc agaaacagag atcatcttta 180 attcaataca ggtagttcca aggattcatgaaggaaatat ttccaaacaa gatgaaacaa 240 tgatgacaga cttaagcatt ctgccttcctctaatttcac actcaagatc cctcttgaag 300 aaagtgctga gagttctaac ttcattggctacgtagtggc aaaagctcta cagcatttta 360 aggaacattt taaaacctgg taagcagagtgcctggttag gaatgccttg ttgacaggaa 420 tagttaattc tcaaaaggga aaaacaaaacttgtttcaaa atacctggaa aacatgttta 480 acctcattaa taaagacatg aaaacaaacaagatggcatt ttct 524 123 568 DNA Homo sapiens 123 gaattagacg aggcgatcaggttggtcaat gtatccaatc agcagtatgg ccagattctc 60 cagatgaccc ggaagcacttggaggacacc gcctatctgg tggagaagat gagagggcaa 120 tttggctggg tgtctgaactggcaaaccag gccccagaaa cagagatcat ctttaattca 180 atacaggtag ttccaaggattcatgaagga aatatttcca aacaagatga aacaatgatg 240 acagacttaa gcattctgccttcctctaat ttcacactca agatccctct tgaagaaagt 300 gctgagagtt ctaacttcattggctacgta gtggcaaaag ctctacagca ttttaaggaa 360 cattttaaaa cctgaaaaagatcctgaggc tcagtgtcca aggtccaatg aactactcag 420 
gtcggaggtg gtagagcagcatgtggagcc agttctctct ccgactccat catcacactg 480 cacggcttcc tgttaagatatttgctcaaa aaatgcgaga tataaaaatc tgggtaagaa 540 gatctaatgc atcctatatccagtaagt 568 124 1141 DNA H. sapiens misc_feature (789)...(798)additional sequence present in full genomic sequence 124 cctgaaagcctggcgccaat gacccgcgag acattttttg cctggggtgc tcctgtcgga 60 aaggaaagaggaaaggacga ctaagaactt atactcgaac tcccgaattt ctcttttcaa 120 ggtttaagaggaaagctggt tcgtggggat tggatgggag gccaccagga aaccaagttc 180 ccgcgccagcttcagtgctc tcctcttycc gccgcctttg ccccgcccac atcactttcg 240 ctccagtttttgaaaacgct gcgaagcgga atggtccaca ggggaaaacg gaggaggggc 300 caaagccaggactttgagac cggcgcgcgg tcaagcccag gcagctctcc ctaaccctcc 360 agcactgggcaaacgctgcc cgatgacgcc cgcctcgggg gccacggcat cactggggcg 420 actgcgagcccggccgcgga gccgctggga cgcggcttac ctcccggctg tcgctgctgt 480 gtgtgttgcccgcgccagtc acgtccctaa tgggaccctc cgtttcggcg tctgtaaggc 540 gaggaggacgatgcgtcccc tccctsgcag gattgaggtt aggactaaac ggggtccgca 600 gcgcccggcagctcccgagc gctctcccca gccgcgcctc cctccttccc gccacccgtc 660 ccgcaggggcccgcggcgtc acctctcagg ctgtagcgcg cctgcatgcc gaataccgac 720 agggtgccggtgcccgtgcg gtcgtccttc ctgacgccgc agcggaggat gtgttggatc 780 tgccccaggtactttcagga tttccaggtc ccagatgaag agataattct acttactgga 840 tataggatgcattagatctt cttaccttaa aaaaaaaaaa aaaggcagca atgatcaaaa 900 tactaataaattactcacag actcagtgta ttttttcttg gagtaaaagt ccaggatggg 960 taatagaatacctgctgttg gcttttggaa aaattggtac tgtatgtagc aaaataatgt 1020 gaaacccatatgcatggata ttcttaacaa tttgaagaaa tcgtcacagc tttcctgggt 1080 tgttgagcctctaaaatggt cttttcctct gatgtgataa taaagtgttt attttgaact 1140 c 1141 12527 PRT Homo sapiens 125 Cys Arg Glu Ser Leu Ala Asp Ser Trp Gly Glu CysArg Ser Cys Leu 1 5 10 15 Glu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys 2025 126 29 PRT Homo sapiens 126 Gly Glu Leu Asp Gln Asn Leu Ser Arg CysPhe Lys Phe His Glu Lys 1 5 10 15 Cys Gln Lys Cys Gln Ala His Leu SerGlu Asp Cys Pro 20 25 127 27 PRT Cavia sp. 
127 Cys Gln Val Ser Leu Ala Asp Ser Trp Asp Glu Cys Arg Ala Cys Leu 1 5 10 15 Glu Ser Asn Cys Met Arg Phe Asp Thr Thr Cys 20 25 128 30 PRT Cavia sp. 128 Asp Gly Lys Leu Gly Gln Asn Leu Ser Asp Cys Val Asn Phe Arg Lys 1 5 10 15 Arg Cys Gln Lys Cys Gln Asp Tyr Leu Ser Asp Asp Cys Pro 20 25 30 129 27 PRT Bos sp. 129 Cys Gln Val Ser Leu Met Gly Ser Trp Asp Glu Cys Lys Ser Cys Leu 1 5 10 15 Glu Ser Asp Cys Met Arg Phe Tyr Thr Thr Cys 20 25 130 29 PRT Bos sp. 130 Leu Cys Gly Glu Pro Gly Gln Asn Ser Ser Glu Cys Leu Gln Phe His 1 5 10 15 Ala Arg Cys Gln Lys Cys Gln Asp Tyr Leu Trp Ala Asp 20 25 131 30 PRT Homo sapiens 131 Cys Arg Glu Ser Leu Ala Asp Ser Trp Gly Glu Cys Arg Ser Cys Leu 1 5 10 15 Glu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys Cys Gly Glu 20 25 30 132 9 PRT Homo sapiens 132 Arg Arg Ser Asn Ala Ser Tyr Ile Gln 1 5 133 494 PRT Homo sapiens 133 Met Lys Ile Lys Ala Glu Lys Asn Glu Gly Pro Ser Arg Ser Trp Trp 1 5 10 15 Gln Leu His Trp Gly Asp Ile Ala Asn Asn Ser Gly Asn Met Lys Pro 20 25 30 Pro Leu Leu Val Phe Ile Val Cys Leu Leu Trp Leu Lys Asp Ser His 35 40 45 Cys Ala Pro Thr Trp Lys Asp Lys Thr Ala Ile Ser Glu Asn Leu Lys 50 55 60 Ser Phe Ser Glu Val Gly Glu Ile Asp Ala Asp Glu Glu Val Lys Lys 65 70 75 80 Ala Leu Thr Gly Ile Lys Gln Met Lys Ile Met Met Glu Arg Lys Glu 85 90 95 Lys Glu His Thr Asn Leu Met Ser Thr Leu Lys Lys Cys Arg Glu Glu 100 105 110 Lys Gln Glu Ala Leu Lys Leu Leu Asn Glu Val Gln Glu His Leu Glu 115 120 125 Glu Glu Glu Arg Leu Cys Arg Glu Ser Leu Ala Asp Ser Trp Gly Glu 130 135 140 Cys Arg Ser Cys Leu Glu Asn Asn Cys Met Arg Ile Tyr Thr Thr Cys 145 150 155 160 Gln Pro Ser Trp Ser Ser Val Lys Asn Lys Ile Glu Arg Phe Phe Arg 165 170 175 Lys Ile Tyr Gln Phe Leu Phe Pro Phe His Glu Asp Asn Glu Lys Asp 180 185 190 Leu Pro Ile Ser Glu Lys Leu Ile Glu Glu Asp Ala Gln Leu Thr Gln 195 200 205 Met Glu Asp Val Phe Ser Gln Leu Thr Val Asp Val Asn Ser Leu Phe 210 215 220 Asn Arg Ser Phe Asn Val Phe Arg Gln Met Gln Gln Glu Phe Asp Gln 225 230 235 240 Thr Phe Gln Ser His Phe Ile Ser Asp
Thr Asp LeuThr Glu Pro Tyr 245 250 255 Phe Phe Pro Ala Phe Ser Lys Glu Pro Met ThrLys Ala Asp Leu Glu 260 265 270 Gln Cys Trp Asp Ile Pro Asn Phe Phe GlnLeu Phe Cys Asn Phe Ser 275 280 285 Val Ser Ile Tyr Glu Ser Val Ser GluThr Ile Thr Lys Met Leu Lys 290 295 300 Ala Ile Glu Asp Leu Pro Lys GlnAsp Lys Ala Pro Asp His Gly Gly 305 310 315 320 Leu Ile Ser Lys Met LeuPro Gly Gln Asp Arg Gly Leu Cys Gly Glu 325 330 335 Leu Asp Gln Asn LeuSer Arg Cys Phe Lys Phe His Glu Lys Cys Gln 340 345 350 Lys Cys Gln AlaHis Leu Ser Glu Asp Cys Pro Asp Val Pro Ala Leu 355 360 365 His Thr GluLeu Asp Glu Ala Ile Arg Leu Val Asn Val Ser Asn Gln 370 375 380 Gln TyrGly Gln Ile Leu Gln Met Thr Arg Lys His Leu Glu Asp Thr 385 390 395 400Ala Tyr Leu Val Glu Lys Met Arg Gly Gln Phe Gly Trp Val Ser Glu 405 410415 Leu Ala Asn Gln Ala Pro Glu Thr Glu Ile Ile Phe Asn Ser Ile Gln 420425 430 Val Val Pro Arg Ile His Glu Gly Asn Ile Ser Lys Gln Asp Glu Thr435 440 445 Met Met Thr Asp Leu Ser Ile Leu Pro Ser Ser Asn Phe Thr LeuLys 450 455 460 Ile Pro Leu Glu Glu Ser Ala Glu Ser Ser Asn Phe Ile GlyTyr Val 465 470 475 480 Val Ala Lys Ala Leu Gln His Phe Lys Glu His PheLys Thr 485 490 134 1541 DNA Rattus 134 aaaacgacgg ccagtgcggc acgaggcacatcgtaaaaag tgaaggtcct ttcagaagtt 60 agtggcaatt tctctgggga gagctgcaatatcggtggaa cactggacca tgcagccacc 120 actctttgtg atttctgtgt atctgttatggttgaaatat tgtgacagtg cacctacttg 180 gaaggagaca gatgctacgg atggaaacctaaagagtctt ccagaggtag gagaggcaga 240 tgtagaggga gaggtcaaga aggctttgattggcattaag caaatgaaaa tcatgatgga 300 aaggagagag gaggaacacg caaaattgatgaaagccttg aagaagtgca aagaagaaaa 360 gcaggaggcc cagaaactca tgaacgaagtgcaagaacgt ctggaggaag aagaaaagct 420 atgtcaggca tcttctatag gttcttgggatggatgcagg ccatgtttgg aaagtaactg 480 catacgattt tatacagctt gccaacctggttggtcctct gtgaaaagca tgatgaagca 540 atttctcaag aagatatacc gatttctgtcttcccagagt gaagatgtaa aggatccccc 600 tgccatagaa cagctgacta aggaagatttacaagtggta cacatagaga acctgtttag 660 ccagctggcc gtggatgcaa aatctctcttcaacatgagc ttttacattt 
ttaagcagat 720 gcagcaagaa tttgatcagg cttttcaattatacttcatg tccgatgtgg acttaatgga 780 gccatacccc ccagctttat ctaaagagataatcaaaaaa gaagaacttg ggcaaaggtg 840 gggcattccc aatgtcttcc agctgtttcataatttcagt ctctctgttt atgggagagt 900 ccaacaaata ataatgaaga cactcaatgcaattgaagat tcatgggaac cacacaaaga 960 gttagaccag agaggtatga cttcagagatgttacctgag caaaatggag aaatgtgtga 1020 ggaatttgtc aagaatttat ctggatgtttaaaatttcgt aaaagatgcc aaaaatgtca 1080 caattaccta tctgaagaat gccctgatgtacctgaactt cacatagaat tccttgaggc 1140 cctgaaatta gtcaatgtat ccaatcagcaatatgatcag attgtccaga tgacccagta 1200 tcatttggaa gataccatat acctgatggagaaaatgcaa gagcagtttg gatgggtgtc 1260 tcaactggca agccataacc cagtgactgaggacatcttt aattcaacaa aggcagttcc 1320 aaagattcat ggaggagatt cttccaagcaggatgaaatt atggtagact caagcagcat 1380 tctgccttcc tctaacttca ccgtccagaatcctcctgaa gaaggtgctg agagctcaaa 1440 tgttatttac tacatggcag ctaaagttctgcagcatcta aagggatgtt ttgaaacttg 1500 gtaagaatag ctgattagga aagctttgttgagagggtag g 1541 135 464 PRT Rattus 135 Met Gln Pro Pro Leu Phe Val IleSer Val Tyr Leu Leu Trp Leu Lys 1 5 10 15 Tyr Cys Asp Ser Ala Pro ThrTrp Lys Glu Thr Asp Ala Thr Asp Gly 20 25 30 Asn Leu Lys Ser Leu Pro GluVal Gly Glu Ala Asp Val Glu Gly Glu 35 40 45 Val Lys Lys Ala Leu Ile GlyIle Lys Gln Met Lys Ile Met Met Glu 50 55 60 Arg Arg Glu Glu Glu His AlaLys Leu Met Lys Ala Leu Lys Lys Cys 65 70 75 80 Lys Glu Glu Lys Gln GluAla Gln Lys Leu Met Asn Glu Val Gln Glu 85 90 95 Arg Leu Glu Glu Glu GluLys Leu Cys Gln Ala Ser Ser Ile Gly Ser 100 105 110 Trp Asp Gly Cys ArgPro Cys Leu Glu Ser Asn Cys Ile Arg Phe Tyr 115 120 125 Thr Ala Cys GlnPro Gly Trp Ser Ser Val Lys Ser Met Met Lys Gln 130 135 140 Phe Leu LysLys Ile Tyr Arg Phe Leu Ser Ser Gln Ser Glu Asp Val 145 150 155 160 LysAsp Pro Pro Ala Ile Glu Gln Leu Thr Lys Glu Asp Leu Gln Val 165 170 175Val His Ile Glu Asn Leu Phe Ser Gln Leu Ala Val Asp Ala Lys Ser 180 185190 Leu Phe Asn Met Ser Phe Tyr Ile Phe Lys Gln Met Gln Gln Glu Phe 195200 205 Asp Gln Ala Phe Gln Leu Tyr Phe Met Ser 
Asp Val Asp Leu Met Glu210 215 220 Pro Tyr Pro Pro Ala Leu Ser Lys Glu Ile Ile Lys Lys Glu GluLeu 225 230 235 240 Gly Gln Arg Trp Gly Ile Pro Asn Val Phe Gln Leu PheHis Asn Phe 245 250 255 Ser Leu Ser Val Tyr Gly Arg Val Gln Gln Ile IleMet Lys Thr Leu 260 265 270 Asn Ala Ile Glu Asp Ser Trp Glu Pro His LysGlu Leu Asp Gln Arg 275 280 285 Gly Met Thr Ser Glu Met Leu Pro Glu GlnAsn Gly Glu Met Cys Glu 290 295 300 Glu Phe Val Lys Asn Leu Ser Gly CysLeu Lys Phe Arg Lys Arg Cys 305 310 315 320 Gln Lys Cys His Asn Tyr LeuSer Glu Glu Cys Pro Asp Val Pro Glu 325 330 335 Leu His Ile Glu Phe LeuGlu Ala Leu Lys Leu Val Asn Val Ser Asn 340 345 350 Gln Gln Tyr Asp GlnIle Val Gln Met Thr Gln Tyr His Leu Glu Asp 355 360 365 Thr Ile Tyr LeuMet Glu Lys Met Gln Glu Gln Phe Gly Trp Val Ser 370 375 380 Gln Leu AlaSer His Asn Pro Val Thr Glu Asp Ile Phe Asn Ser Thr 385 390 395 400 LysAla Val Pro Lys Ile His Gly Gly Asp Ser Ser Lys Gln Asp Glu 405 410 415Ile Met Val Asp Ser Ser Ser Ile Leu Pro Ser Ser Asn Phe Thr Val 420 425430 Gln Asn Pro Pro Glu Glu Gly Ala Glu Ser Ser Asn Val Ile Tyr Tyr 435440 445 Met Ala Ala Lys Val Leu Gln His Leu Lys Gly Cys Phe Glu Thr Trp450 455 460 136 1541 DNA Rattus 136 aaaacgacgg ccagtgcggc acgaggcacatcgtaaaaag tgaaggtcct ttcagaagtt 60 agtggcaatt tctctgggga gagctgcaatatcggtggaa cactggacca tgcagccacc 120 actctttgtg atttctgtgt atctgttatggttgaaatat tgtgacagtg cacctacttg 180 gaaggagaca gatgctacgg atggaaacctaaagagtctt ccagaggtag gagaggcaga 240 tgtagaggga gaggtcaaga aggctttgattggcattaag caaatgaaaa tcatgatgga 300 aaggagagag gaggaacacg caaaattgatgaaagccttg aagaagtgca aagaagaaaa 360 gcaggaggcc cagaaactca tgaacgaagtgcaagaacgt ctggaggaag aagaaaagct 420 atgtcaggca tcttctatag gttcttgggatggatgcagg ccatgtttgg aaagtaactg 480 catacgattt tatacagctt gccaacctggttggtcctct gtgaaaagca tgatgaagca 540 atttctcaag aagatatacc gatttctgtcttcccagagt gaagatgtaa aggatccccc 600 tgccatagaa cagctgacta aggaagatttacaagtggta cacatagaga acctgtttag 660 ccagctggcc gtggatgcaa aatctctcttcaacatgagc 
ttttacattt ttaagcagat 720 gcagcaagaa tttgatcagg cttttcaattatacttcatg tccgatgtgg acttaatgga 780 gccatacccc ccagctttat ctaaagagataatcaaaaaa gaagaacttg ggcaaaggtg 840 gggcattccc aatgtcttcc agctgtttcataatttcagt ctctctgttt atgggagagt 900 ccaacaaata ataatgaaga cactcaatgcaattgaagat tcatgggaac cacacaaaga 960 gttagaccag agaggtatga cttcagagatgttacctgag caaaatggag aaatgtgtga 1020 ggaatttgtc aagaatttat ctggatgtttaaaatttcgt aaaagatgcc aaaaatgtca 1080 caattaccta tctgaagaat gccctgatgtacctgaactt cacatagaat tccttgaggc 1140 cctgaaatta gtcaatgtat ccaatcagcaatatgatcag attgtccaga tgacccagta 1200 tcatttggaa gataccatat acctgatggagaaaatgcaa gagcagtttg gatgggtgtc 1260 tcaactggca agccataacc cagtgactgaggacatcttt aattcaacaa aggcagttcc 1320 aaagattcat ggaggagatt cttccaagcaggatgaaatt atggtagact caagcagcat 1380 tctgccttcc tctaacttca ccgtccagaatcctcctgaa gaaggtgctg agagctcaaa 1440 tgttatttac tacatggcag ctaaagttctgcagcatcta aagggatgtt ttgaaacttg 1500 gtaagaatag ctgattagga aagctttgttgagagggtag g 1541 137 464 PRT Rattus 137 Met Gln Pro Pro Leu Phe Val IleSer Val Tyr Leu Leu Trp Leu Lys 1 5 10 15 Tyr Cys Asp Ser Ala Pro ThrTrp Lys Glu Thr Asp Ala Thr Asp Gly 20 25 30 Asn Leu Lys Ser Leu Pro GluVal Gly Glu Ala Asp Val Glu Gly Glu 35 40 45 Val Lys Lys Ala Leu Ile GlyIle Lys Gln Met Lys Ile Met Met Glu 50 55 60 Arg Arg Glu Glu Glu His AlaLys Leu Met Lys Ala Leu Lys Lys Cys 65 70 75 80 Lys Glu Glu Lys Gln GluAla Gln Lys Leu Met Asn Glu Val Gln Glu 85 90 95 Arg Leu Glu Glu Glu GluLys Leu Cys Gln Ala Ser Ser Ile Gly Ser 100 105 110 Trp Asp Gly Cys ArgPro Cys Leu Glu Ser Asn Cys Ile Arg Phe Tyr 115 120 125 Thr Ala Cys GlnPro Gly Trp Ser Ser Val Lys Ser Met Met Lys Gln 130 135 140 Phe Leu LysLys Ile Tyr Arg Phe Leu Ser Ser Gln Ser Glu Asp Val 145 150 155 160 LysAsp Pro Pro Ala Ile Glu Gln Leu Thr Lys Glu Asp Leu Gln Val 165 170 175Val His Ile Glu Asn Leu Phe Ser Gln Leu Ala Val Asp Ala Lys Ser 180 185190 Leu Phe Asn Met Ser Phe Tyr Ile Phe Lys Gln Met Gln Gln Glu Phe 195200 205 Asp Gln Ala Phe Gln Leu Tyr 
Phe Met Ser Asp Val Asp Leu Met Glu210 215 220 Pro Tyr Pro Pro Ala Leu Ser Lys Glu Ile Thr Lys Lys Glu GluLeu 225 230 235 240 Gly Gln Arg Trp Gly Ile Pro Asn Val Phe Gln Leu PheHis Asn Phe 245 250 255 Ser Leu Ser Val Tyr Gly Arg Val Gln Gln Ile IleMet Lys Thr Leu 260 265 270 Asn Ala Ile Glu Asp Ser Trp Glu Pro His LysGlu Leu Asp Gln Arg 275 280 285 Gly Met Thr Ser Glu Met Leu Pro Glu GlnAsn Gly Glu Met Cys Glu 290 295 300 Glu Phe Val Lys Asn Leu Ser Gly CysLeu Lys Phe Arg Lys Arg Cys 305 310 315 320 Gln Lys Cys His Asn Tyr LeuSer Glu Glu Cys Pro Asp Val Pro Glu 325 330 335 Leu His Ile Glu Phe LeuGlu Ala Leu Lys Leu Val Asn Val Ser Asn 340 345 350 Gln Gln Tyr Asp GlnIle Val Gln Met Thr Gln Tyr His Leu Glu Asp 355 360 365 Thr Ile Tyr LeuMet Glu Lys Met Gln Glu Gln Phe Gly Trp Val Ser 370 375 380 Gln Leu AlaSer His Asn Pro Val Thr Glu Asp Ile Phe Asn Ser Thr 385 390 395 400 LysAla Val Pro Lys Ile His Gly Gly Asp Ser Ser Lys Gln Asp Glu 405 410 415Ile Met Val Asp Ser Ser Ser Ile Leu Pro Ser Ser Asn Phe Thr Val 420 425430 Gln Asn Pro Pro Glu Glu Gly Ala Glu Ser Ser Asn Val Ile Tyr Tyr 435440 445 Met Ala Ala Lys Val Leu Gln His Leu Lys Gly Cys Phe Glu Thr Trp450 455 460 138 1326 DNA Rattus 138 aaaacgacgg ccagtgcggc acgaggcacatcgtaaaaag tgaaggtcct ttcagaagtt 60 agtggcaatt tctctgggga gagctgcaatatcggtggaa cactggacca tgcagccacc 120 actctttgtg atttctgtgt atctgttatggttgaaatat tgtgacagtg cacctacttg 180 gaaggagaca gatgctacgg atggaaacctaaagagtctt ccagaggtag gagaggcaga 240 tgtagaggga gaggtcaaga aggctttgattggcattaag caaatgaaaa tcatgatgga 300 aaggagagag gaggaacacg caaaattgatgaaagccttg aagaagtgca aagaagaaaa 360 gcaggaggcc cagaaactca tgaacgaagtgcaagaacgt ctggaggaag aagaaaagct 420 atgtcaggca tcttctatag gttcttgggatggatgcagg ccatgtttgg aaagtaactg 480 catacgattt tatacagctt gccaacctggttggtcctct gtgaaaagca tgatgaagca 540 atttctcaag aagatatacc gatttctgtcttcccagagt gaagatgtaa aggatccccc 600 tgccatagaa cagctgacta aggaagatttacaagtggta cacatagaga acctgtttag 660 ccagctggcc gtggatgcaa 
aatctctcttcaacatgagc ttttacattt ttaagcagat 720 gcagcaagaa tttgatcagg cttttcaattatacttcatg tccgatgtgg acttaatgga 780 gccatacccc ccagctttat ctaaagagataaccaaaaaa gaagaacttg ggcaaaggtg 840 gggcattccc aatgtcttcc agctgtttcataatttcagt ctctctgttt atgggagagt 900 ccaacaaata ataatgaaga cactcaatgcaattgaagat tcatgggaac cacacaaaga 960 gttagaccag agaggtatga cttcagagatgttacctgag caaaatggag aaatgtgtga 1020 ggaatttgtc aagaatttat ctggatgtttaaaatttcgt aaaagatgcc aaaaatgtca 1080 caattaccta tctgaaggca gttccaaagattcatggagg agattcttcc aagcaggatg 1140 aaattatggt agactcaagc agcattctgccttcctctaa cttcaccgtc cagaatcctc 1200 ctgaagaagg tgctgagagc tcaaatgttatttactacat ggcagctaaa gttctgcagc 1260 atctaaaggg atgttttgaa acttggtaagaatagctgat taggaaagct ttgttgagag 1320 ggtagg 1326 139 344 PRT Rattus 139Met Gln Pro Pro Leu Phe Val Ile Ser Val Tyr Leu Leu Trp Leu Lys 1 5 1015 Tyr Cys Asp Ser Ala Pro Thr Trp Lys Glu Thr Asp Ala Thr Asp Gly 20 2530 Asn Leu Lys Ser Leu Pro Glu Val Gly Glu Ala Asp Val Glu Gly Glu 35 4045 Val Lys Lys Ala Leu Ile Gly Ile Lys Gln Met Lys Ile Met Met Glu 50 5560 Arg Arg Glu Glu Glu His Ala Lys Leu Met Lys Ala Leu Lys Lys Cys 65 7075 80 Lys Glu Glu Lys Gln Glu Ala Gln Lys Leu Met Asn Glu Val Gln Glu 8590 95 Arg Leu Glu Glu Glu Glu Lys Leu Cys Gln Ala Ser Ser Ile Gly Ser100 105 110 Trp Asp Gly Cys Arg Pro Cys Leu Glu Ser Asn Cys Ile Arg PheTyr 115 120 125 Thr Ala Cys Gln Pro Gly Trp Ser Ser Val Lys Ser Met MetLys Gln 130 135 140 Phe Leu Lys Lys Ile Tyr Arg Phe Leu Ser Ser Gln SerGlu Asp Val 145 150 155 160 Lys Asp Pro Pro Ala Ile Glu Gln Leu Thr LysGlu Asp Leu Gln Val 165 170 175 Val His Ile Glu Asn Leu Phe Ser Gln LeuAla Val Asp Ala Lys Ser 180 185 190 Leu Phe Asn Met Ser Phe Tyr Ile PheLys Gln Met Gln Gln Glu Phe 195 200 205 Asp Gln Ala Phe Gln Leu Tyr PheMet Ser Asp Val Asp Leu Met Glu 210 215 220 Pro Tyr Pro Pro Ala Leu SerLys Glu Ile Thr Lys Lys Glu Glu Leu 225 230 235 240 Gly Gln Arg Trp GlyIle Pro Asn Val Phe Gln Leu Phe His Asn Phe 245 250 255 Ser Leu Ser ValTyr Gly Arg 
Val Gln Gln Ile Ile Met Lys Thr Leu 260 265 270 Asn Ala IleGlu Asp Ser Trp Glu Pro His Lys Glu Leu Asp Gln Arg 275 280 285 Gly MetThr Ser Glu Met Leu Pro Glu Gln Asn Gly Glu Met Cys Glu 290 295 300 GluPhe Val Lys Asn Leu Ser Gly Cys Leu Lys Phe Arg Lys Arg Cys 305 310 315320 Gln Lys Cys His Asn Tyr Leu Ser Glu Gly Ser Ser Lys Asp Ser Trp 325330 335 Arg Arg Phe Phe Gln Ala Gly Glx 340 140 18596 DNA Homo sapiens140 cctgtagtcc cagctacgcg agaggctgag gcagcagaat tacttgaacc caggaggcgg 60aggttgcagt gagccgagat cgcgccactg cactccagcc tgggtgagag agcgagactc 120tgtctcaaaa aaaaaaaaaa aagaccgcca gggctcaaac aaaaaacctc ggaaaagccc 180tggcggtctt tttttttttt tttttttttt ttttttggga cagtcttgct ctgtcgccca 240ggctggagta caatggtcgg atcttggctc actgcaacct ctgcctccca ggttcaagca 300attcttctgc ctcagcctcc caagtagcca ccacgcccag ctaatttttg tacttttagt 360agagacgggg gtttcaccat gttgtccagg ctggtcttga actcctgacc tcaggtgatc 420cacccgcctc ggccccccaa agtactagga ttacaggcgt gagccaccgc gtccagcgcc 480ctggcggttt ttaatcaagt agaaaagctg cattatacca cttgcttcgg ttgcttcagt 540gagaacgaag aaatggaaat gcaaatccct tattagttgt aggaaacaga tctcaaacag 600cagttttgtt gacaagaccg caggaaaacg tgggaactgt gctgctggct tagagaaggc 660gcggtcgacc agacggttcc caaagggcgc agtccttccc agccaccgca cctgcatcca 720ggttcccggg tttcctaaga ctctcagctg tggccctggg ctccgttctg tgccacaccc 780gtggctcctg cgtttccccc tggcgcacgc tctctagagc gggggccgcc gcgaccccgc 840cgagcaggaa gaggcggagc gcgggacggc cgcgggaaaa ggcgcgcgga aggggtcctg 900ccaccgcgcc acttggcctg cctccgtccc gccgcgccac ttggcctgcc tccgtcccgc 960cgcgccactt cgcctgcctc cgtcccccgc ccgccgcgcc atgcctgtgg ccggctcgga 1020gctgccgcgc cggcccttgc cccccgccgc acaggagcgg gacgccgagc cgcgtccgcc 1080gcacggggag ctgcagtacc tggggcagat ccaacacatc ctccgctgcg gcgtcaggaa 1140ggacgaccgc acgggcaccg gcaccctgtc ggtattcggc atgcaggcgc gctacagcct 1200gagaggtgac gccgcgggcc cctgcgggac gggtggcggg aaggagggag gcgcggctgg 1260ggagagcgct cgggagctgc cgggcgctgc ggaccccgtt tagtcctaac ctcaatcctg 1320ccagggaggg gacgcatcgt cctcctcgcc ttacagacgc cgaaacggag 
ggtcccatta 1380gggacgtgac tggcgcgggc aacacacaca gcagcgacag ccgggaggta agccgcgtcc 1440cagcggctcc gcggccgggc tcgcagtcgc cccagtgatg ccgtggcccc cgaggcgggc 1500gtcatcgggc agcgtttgcc cagtgctgga gggttaggga gagctgcctg ggcttgaccg 1560cgcgccggtc tcaaagtcct ggctttggcc cctcctccgt tttcccctgt ggaccattcc 1620gcttcgcagc gttttcaaaa actggagcga aagtgatgtg ggcggggcaa aggcggcggg 1680aagaggacag cactgaagct ggcgcgggaa cttggtttcc tggtggcctc ccatccaatc 1740cccacgaacc agctttcctc ttaaaccttg aaaagagaaa ttcgggagtt cgagttctta 1800gtcgtccttt cctctttcct ttccgacagg agcaccccag gcaaaaaatg tctcgcgggt 1860cattggcgcc aggctttcag gggacagtgg ggcggggcgg ggtgggcaca ggacgttagg 1920cagccgttgg ccctccctaa ggccacaccg tcctgccgtc ctggatcctg cgccagctgc 1980gcgggggagg ggactcgaag gtgtgtgagc caggggctga ccttgaccgc tcagataaat 2040ggagcgcagc cttgacacag gggtggaggt ggttttgaat ggggaaaccc attcgtggtg 2100aagcagattc actgtagcta gcggaaaagc cctccggccc acggacccat ctagagacga 2160atacatagca gctgctgtgg ctgattggcg tgggacagcg tggggagttt tgtctgagga 2220gagggatcca cttttctgca gctccaagcc caggggcctt tgatgagcca tagacctcat 2280ttttaaccca cctttctgct tagacattga gcaagttact tctcatatag cttccctata 2340tgttaaaaat ggagaaaata atgcttagta ggcaattctg ataaaagcag gtgcttgcaa 2400aaatctctct gttgtctgaa tataaactgt accacaagcg agtgcggatg aacgaggact 2460gcatttaaag ataagttttt acactttcat ttctctgtgg ctcgacactt ctgatgcctc 2520cctttttgtt cctgggacac atgcttggtg ttgtcttcac acctttgtga caggattagc 2580actagtgggc agtggatgat agctcctcct cccttttgcc acatgttcat ccctgccctc 2640gccaccatct cactgtgtgg aattcctgtg tccactggtc accggggcac agaagtgctg 2700tctcagcctg aatcgggcca ctgatgggac ttgcagcctg ggagctccac cgtgatctct 2760ggcccacttt gcgggagtct aggctttctg gatgctccag gcctcacgtc ccagggcagt 2820tttcttccct gaagaaagtt ggatggcatg atctgtcttc ccatcttgaa accgtatggc 2880aaattgtttt tcagatgaat tccctctgct gacaaccaaa cgtgtgttct ggaagggtgt 2940tttggaggag ttgctgtggt ttatcaaggt aaagaagtcg ctgctattag aagtcagtag 3000tctgttctca acacagcagc cagtgagatc ctttcaaaac tcaaagcagc caggtgtggt 3060ggctcacgcc tgtaatccca 
ccgctttggg aggctgagtc agatcacctg aggttaggaa 3120tttgggacca gcctggccaa catggcgaca ccccagtctc tactaataac acaaaaaatt 3180agccaggtgt gctggtgcat gtctgtaatc ccagctactc aggaggctga ggcatgagaa 3240ttgctcacga ggcggaggtt gtagtgagct gagatcgtgg cactgtactc cagcctggcg 3300acagagggag aacccatgtc aaaaacaaaa aaagacacca ccaaaggtca aagcatatca 3360ttcctcaccc tcaagccctt agtggctcca tttcactcag taagagccac ggtccttatg 3420gtgtccgttt ttcagctctg accttagctg ctgctctctg caccaccctg ctgttcttgt 3480gagtttttga gcacaccggg acatccccac tccctggaac cttcttcccc cacacttggc 3540ttcttccttt gagtctctac tccactcggg caagccttcc tagacctcct gatttaaaac 3600tgtgactctc ccccaacctc cttggtgttt ctccgtagac gaacatcacc atctgatgta 3660tgtcagcctt tcccttcccc tgttagaagg gggacagcag gtagtaaaag tgaaatgtgc 3720tgtaagcttt atgagggcag aggatttgtt tctcgtgttc actgttgtat cgccagggcc 3780tcaaacacag cctgccacat agtaggagtc aacatatatt gatcactaaa tgtagatacc 3840acctgtgttc ccatgttcat ataaattcta gaagagtctc ttcagtaaca aggtgaaccc 3900cttccagagg gctgagtagg tacctcaggc cggggccaga gtgctgtgaa gacagcagca 3960gcccagacca agcttctctg tgttccgtgt cctggtctag aaccagcgat gttctttctg 4020accagtgctt tttggaaggt ggctgaggtc tgggctcagg tctgggccat actagaagct 4080gggatccctt ctatagagca cttggtatgg cttgtatggt cttggggcaa gccagaccca 4140agccctctta tcccatttta gaaagggctt caatttggat ccagccccag gtctgcctta 4200gctctgtatt cttggggtat tttgttctgt attggcctat cttgactaac aatgagcctt 4260ggatttgaaa catatcatca gaaacctcag aagacaacat tcttaaactg gctagagcct 4320ggtctgaatg gatgaaaagg agagactttt gaagcaatat gtaaaagatt gagaaatgat 4380ttgttggaaa tttctcaatt ggagaaattt ctttgatttg ttggaaattt ctttgattct 4440ttctcaatca aagaaaatcg ggacaaactc aacaatagaa agggaggaag caagatactc 4500agaaataaaa tgcattcccc tgtttcaact taatgcttca attcaggatt ctaaggaatc 4560cttgccagga atgtcagact caccttgata gttggagtta ctccattggt gactcgatca 4620aatacaggag ttgaggcacc tgcactgtaa aatactgatt agtctgatca ttaggaatat 4680cctgtatgcc aggtagaaga tacattgaac agattgcatg taggcattaa attcattttg 4740gggtattaca tatagacaac acatttcatt aagaaacata aaactgtcag 
atcggtggaa 4800tacttaaaag cacttggagg tgtttagcct aaaaagctta gttgagggga atggaagaaa 4860agatctggga gggtggttcc aaagaaggga tcagactatc ctaaagccct caggaatctg 4920ggctgggacc acctacttaa agataggatg ggcagctggg tgtggtggct cacgcctgta 4980atcccagcac ttcgggaggc cgaagcgggc ggatcacctg aggtcaggag ttcgaggcca 5040gcctgaccaa catggagaaa cgctgtctct actaaaaata caaaattagc tgggtgtagt 5100ggcgcatgcc tgtaatccca gctactcggg aggctgaggc aggggaatcg cttgaacctg 5160ggaggtggag ggtgccgtga gccacgatcg cgccattgca ctccagcctg ggcaacaaga 5220gcgaaactct caaaaaacaa aaaaaaggat gggttccata tgggtggtgt caagtgccca 5280cctcctagca agtcagcagg ggccagaggc ccttgtaagt ggtgtctcgg ggggatcaac 5340tgagatggct taagatttac ctggatgcct gctctgctct ccccatctct tccagggatc 5400cacaaatgct aaagagctgt cttccaaggg agtgaaaatc tgggatgcca atggatcccg 5460agactttttg gacagcctgg gattctccac cagagaagaa ggggacttgg gcccagttta 5520tggcttccag tggaggcatt ttggggcaga atacagagat atggaatcag gtgaggagat 5580agaacaatgc cttccatttc cgggtgccct tcctagcacg tgtttgctcc gttgttttag 5640ataaggtctg ggggatgagt caatgtcaca ggagctgatg tatagctttg accttgtgag 5700gggtggtgcc aggttgaagc cacaattaac gcctactgaa ggccgtttca catctttttt 5760tttttttttt ttttaattat tatactttaa gttttagggt acatgtgcac aatgtgcagg 5820ttagttacat atgtatacat gtgccatgct ggtgcgctgc accactaact caccatctag 5880catcaggtat atctcccaat gctatccctc ccccctcctc ccaccccaca acatccccag 5940agtgtgatgt tccccttcct gtgtccatat gttctcgttg ttcgattccc actatgagtg 6000agaatatgcg gtgtttggtt ttttgttctt gcgatagttt actgagaatg atgatttcca 6060tttcaccacg tccctacaga ggacatgaac tcatcatttt ttatggctgc atagtattcc 6120atggtgtata tgtgccacat tttcttaatc cagtctatca tgttggacat ttgggttggt 6180tccaagtctt tgcctattgt gaatagtgcc acaataaaca tacgtgtgca tgtgtcttta 6240tagcagcatg atttaatagt cctttgggta tatacccagt aatgggatgg ctgggtcaaa 6300tggtatttct agttctagat ccccgaggaa tcgccacact gacttccaca atggttgaac 6360tagtttacag tcccaccaac agtgtcaaag tgtcctattt ctccacatcc tctccagcac 6420ctgttgtttc ctgacttttt aatgattgcc attctaactg gtgtgagatg gtatctcatt 6480gtggttttga tttgcgtttc 
tctgatggcc agtgatggtg agcatttttt catgtgtttt 6540ttggctgcat aaatgtcttc ttttgagaag tgtctgttca tgtccttcgc ccactttttg 6600atggggttgt ttttttctta taaatttgtt tgagttcatt gtagattctg gatattagcc 6660ctttgtcaga tgagtaggtt gcaaaaatgt tctcccattt tgtgggttgc ctgttcactc 6720tgatggtagt ttcttttgct gtgcagaagc tctttagttt aattagatcc catttgtcaa 6780ttttggcttt tgttgccatt gcttttggca taggcatgaa gtccttgccc atgcctatgt 6840cctgaatggt aatgcctagg ttttcttcta gggtttttat ggttttaggt ctaacgttta 6900agtctttaat ccatcttgaa ttgatttttg tataaggtgt aaggaaggga tccagtttca 6960gctttttaca tatggctagc cagttttccc agcaccattt attacatagg gaatcctttc 7020cccattgctt gtttttctca ggtttgtcaa agatcagata gttgtagata tgcggcgtta 7080tttctgaggg ctctgttctg ttccattgat ctatgtgtct gttttggtac cagtaccata 7140ctgttttggt tactgtagcc ttgtagtata gtttgaagtc aggtagcgtg atgcctccag 7200ctttgttctt ttggcttagg attgacttgg cgatgcgggc tcttttttgg ttccatatga 7260actttaaagt agttttttcc aattctgtga agaaagtcat tggtagcttg atggggatgg 7320cattgaatct ataaattacc ttgggcagta tggccatttt cacgatattg attcttccta 7380cccatgagca tggaatggtc ttccatttct ttgtatcctc ttttatttca ttgagcagtg 7440gtttgtagtt ctccttgaag aggtccttca catccctttt aaggtggatt cctaggtatt 7500ttattctctt tgaagcaatt gtgagtggaa gttcactcat gatttggctc tctgtttgtc 7560tgttattggt gtataagaat gcttgtgatt tttgcagatt gattttatat cctgagactt 7620tgctgaagct gcttatcagc ttaaggagat tttgggctga gacaatgggg ttttctagat 7680atacaatcat gtcgtctgca aacagggaca atttgacttc ctcttttcct aattgaatac 7740cctttatttc cttctcctgc ctaattgccc tggccagaac ttccaacact atgttgaata 7800ggagtggtga gagagggcat ccctgtcttg tgccagtttt caaagggaat gcttccagtt 7860tttgcccatt cactatgata ttggctgtgg ctttgtcata gatagctctt attattttga 7920aatatgttcc atcaatacct aatttattga gagtttttag catgatgtgt tgttgaattt 7980tgtcaaaggc tttttctgca tctattgaga taatcatgtg gtttttgtct ttggatctgt 8040ttatatgctg gattacattt attgatttgc gtatattgaa ccagccttgc atcctaggga 8100tgaagcccac atgatcatgg tggataagct ttttgatgtg ctgctggatt cggtttgcca 8160gtattttatt gaggattttt gcatcaatgt tcatcaagga tattggtcta 
aaattctctt 8220ttttggtgtg tctctgccca gctttggtat caggatgatg ttggcttcat aaaatgagtt 8280agggaggatt ccctcttttt ctattgattg gaatagtttc agaaggaatg gtaccagttc 8340ctctttgtac ctctggagaa ttcggctgtg aatccatctg gtcctggact ctctttggtt 8400ggtaagctat tgattattgc cacaatttca gctcctgtta ttggtctatt cagagattca 8460acttcttcct ggtttagtct tgggagagtg tatgtgtcaa ggaatttatc catttcttct 8520agattttcta gtttatttgc gtagaggtgt ttgtagtaat ctctgatggt agtttgtatt 8580tctgtgggat cggtggtgat atccccttta tcatttttta ttgcgtctat ttgattcttc 8640tctttttctt tattagtctt gctagcggtc tataaatttt gttgatcctt tcaaaaaacc 8700agctcctgga ttcattaatt ttttgaaggg ttttttgtgt ctctatttcc ttcagttctg 8760ctctgatttt agttatttct tgccttctgc tagcttttga atatgtttgc tcttgctttt 8820ctagttcttt taattgtgat gttagggtgt caattttgga tctttcctgc tttctcttgt 8880gggcatttag tgctataaat ttccctctac acactgcttt gaatgtgtcc cagaggttct 8940ggtatgttgt gtctttgttc ttgttggttt caaagaacat ctttatttct gccttcattt 9000cgttatgtac ccagtagtca ttcaggagca ggttgttcag tttccatgta gttgagcagt 9060tttgagtgag attcttaatc ctgagttcta gtttgattgc actgtggtct gagagatagt 9120ttgttataat ttctgttctt ttacatttgc tgaggagagc tttacttcca actatgtggt 9180cggttttgga ataggtgtgg tgtggtgctg aaaaaaatgt atattctgtt gatttgggat 9240ggagttctgt agatgtctat taggtctgct tggtgcagag ctgagttcaa ttcctgggta 9300tccttgttga ctttctgtct cgttgatctg tgtactgttg acagtgggtg ttaaagtctc 9360ccattattaa tgtgtggagt ctaagtctct ttgtaggtca ctcagatgat tggcacttac 9420tgggcgcttg gcactttcca tactgtgtca tcggcagata gctgcatggt tggtgttcgt 9480gctggggaat gggaagttca tcggtgggac aaggacaaaa tgcccccatt gctttgttgt 9540ggctttaatc tccctttcga ggctgagcca cagcgtgctg taggtggcgc tgctgtgaag 9600cgcagtacca gggtcacact ccactcccag ctctgcagag gtggagaaag aatgaaacat 9660ctcactcctg gacttccact ttcctgtcac tgttggtgtc acctcttact ggatgtcaca 9720gagcccagcc cctcccacct gtgcctagga aaagcagatg ccaccttgga atgtggggtt 9780tgtgtgtgca atttactagc tgggcagaga ccagcaacct ggagagcagg tgtctcgtct 9840aaggggacag tcacatttca cctccagcca cctggaggaa tttgggcctg gtgatgtcag 9900aattcttcaa taaaagccta 
aaatctatat tttatgtgcg gtcatgagat ctgttaaatg 9960ttagcaactt caggaagttt aaaaatgctg tgtggaccta gaataggcaa gttcttaaag 10020gcagaaagtg gaatgctagt ttccagggac tggggaacag ggaggaatgg ggagttcatg 10080tttaatgggc acagaggttt tgttagggat gacgaaaaag ttcgggagat ggtgatggtg 10140atggagatgg tgatggtgat ggagatggtg atggtgatgg tgatggtgat gggtgatggt 10200gatggtgatg gtgatggtga tggagatggt gatggtgatg gtgatggaga tggtgatggt 10260gatggtgatg gtgatggaga tggtgatggt gatggagatg gtgatggtga tggtgatgga 10320gatggtgatg gtgatggtga tggtgatggt gatggtgatg gtgatggaga tggagatggt 10380gatggtgatg gttgcctaac atcaggaacg tgcttaatgc ttctgaattg cacacaaaaa 10440tggcaagttt aatattatgt gtactttatc acaatgaaaa aagctgctgc gtgggccaag 10500ttacttgtgc aggtaatgtt ctgcaggtgg ttgcctgcac ctcagttgta gggtgtccgt 10560aggatgtgag gccagtcccc gggcttaatg atgctttaaa tcctgcctag tattcaatta 10620tttcttgtcg cttaaaaggc ctaataaaat tatggtctta gtttacagtg gtatgaatgc 10680ttagctgttg gattttagta ggaaagttcg tccctttttg tttttaattt tgttttacag 10740attcacagga attttttttt tttttttttt tttttttttt taatgcacag aaagtttccc 10800tggactctct acccagtttc cccagtgata atatcttggg taacatcctg tatacattca 10860cattggtgca ttcctcagag ttgtcagatt ttgctagttt tacgtgcact tgtgtatgtg 10920tgtatttgca attttagcac gtgtagactc ttgtaaccac tacaatcaag ttacagaact 10980acactaccaa ggttcatctt tttaaaatct ttgatgttac cttttttgga acagtgacca 11040tgagaggact ttcctcccaa aattttgaaa actactgaac cagaatatag tctgacacta 11100ataggtagaa atttaaccaa aggagattat gaagctctgc acttgagtta acaaaatcac 11160ttctcagctt ccagttccat ctcagaagga aggaaaaggg attaaaaatc cagagaccag 11220aaaatgggag caaagtacaa ggtggtgtaa tcattacaga ggtttcctga tgtttccaag 11280tcagtcgtgt gttgagctgc taaactctaa agtaatttta ggtggaatgt tggaaacatg 11340ctgctgaggt gatagaaagg aatccatggt cctctgttag ttggaaagta tatggaatac 11400tatattctac ataagataca atactctctg tgagacaagg ataaagtaga ttttgtcagt 11460gaaattgtga caagaatcgc tgatgggttt agagcctaag tttgcgagga gcactggaag 11520aaattaagat tgttgagatt ggaaagggtt agctatgggg gaacaggagg aggtgactcc 11580atgacagacc aaatattcaa aggactgtgt 
agaagaggaa aaagactttg ttagggctcc 11640agaggacaga gccaggagtc agacagggcc ttgaactcaa cccaccgaga tctgcaaact 11700ttgcaggatg caccagatgt cttgtagcca tgggtcaagg ggggaccctg ggtaagagac 11760tgtaatagat gacctctaag gccatctcat gacatgtgtg attaatgtat gtacctgtcc 11820tctctttttg acaattctac agattattca ggacagggag ttgaccaact gcaaagagtg 11880attgacacca tcaaaaccaa ccctgacgac agaagaatca tcatgtgcgc ttggaatcca 11940agaggttgaa agaaccccgt cgtcttcatt tatactaacc atactcttag agggaagcaa 12000tctggttttg tgcagaggca ctgagggagg caggaccctg ggcaacttcc cccagccaca 12060tggttgtgtg acgttgggca agtcacattt tgctgcactt tcaccttcag atcatgaggt 12120tgggcccaga ggattttttt tttttttttt ttttttgaga cagagttttg ctctgttgcc 12180caggctggaa tgcaacggcg tgatcttggc tcactgtaac ctctgcctcc tgggttcgag 12240tgattctcct gcctcagcct ccaagtagct gggattacag catgtgccac catgcctggc 12300taattttgta tttttagtag agacgggttc acatgttggt caggctggtc ttgactcctg 12360accctcagat gatctgcctt gcctcagcct cccaaccgag tgatcttaag ttgtgtatta 12420tactcattct tacacaaaaa gggctttaaa tgcctagaaa ctacatgaag atgttaacat 12480tttaaatgga agcagatgaa gttccagctc gctgccacct cactaacatt tttaacaatt 12540atattgtaaa attcaactct accagggtgt agagccaggt gtggtggctc acacctgtaa 12600ttccaacaac tccagaggcc aaggcgagag gatcatttga acccacggaa tttgaggctg 12660tagtgagtca tgatcacgcc attgcactcc atcctgggca acagagtgag accctgaata 12720tttaaaaaca acaacaacaa caaaactcta tcaggatatc ataagtactt agagtgaaat 12780acttgcatct gtaatagaga cttatttttt ttttttttga gacacagtct caccctgttg 12840cccaggctgg agtgcagtgg tttgatctcc gctcacggca acctccatct cccaggttca 12900agtgagttcc cattcctcag ccccagagct gggaccacag gcgcgcgaat ttttgtattt 12960ttagcagaga cggggtttca ctatgttggc caggctagtc tcaaactcaa gttggcctca 13020agtgatctgc ccaccctggc gtcccagtgt tgggatttca ggcatgagcc actgtgcctg 13080gccatgtaat agagactttt aatataggag ggtgtaccag aagcaccagt ttcctgtggc 13140aaacagaatt attcctgctg tatttgtaat ttggtgccac gaggtagccc agatcccttc 13200agctctgatg gaagagcatt gcttcagccg taaatggaca cctgcagaaa ccttgcaccg 13260atggatagtc tccctcagct ccgtgccatc gctgcagggg 
ctgttatgga catcactgca 13320gcccagtggc tctctctcct ggtctccacc atatgagttg gcttctgttt ctctcctgtt 13380ttactttgcc tttagctgtg gtctttcaaa ccaccatccc tccttatctt cctctgctgg 13440ttcctcagat cttcctctga tggcgctgcc tccatgccat gccctctgcc agttctatgt 13500ggtgaacagt gagctgtcct gccagctgta ccagagatcg ggagacatgg gcctcggtgt 13560gcctttcaac atcgccagct acgccctgct cacgtacatg attgcgcaca tcacgggcct 13620gaaggtgggc tgtctcggga agggtgactt gccagcctac cacatgagct cttcagttct 13680ttaatatggg aaaacaaatt gcagagttta gtctctgatt agcttttaaa tttgatatgt 13740gtaagtaaga catgaaccag cttttacttt gaaaccttcc ttttctggaa ggttttctgg 13800ccctgtggta tatgcactaa cagatctata caggttgttt gtgatacagc ttctatggat 13860cttctcaaaa gctatgctga ggttgggtat ggtggctcat gcctgtaatc ccagcacttt 13920ggaagactga gacaggagca attgcttgag gtctggagtt caataccagc ctgggcaaca 13980taacaagatg ctgttgctac aaaaaaatgg aaaagctaca ctaaattatt tttttaaaaa 14040aagccttgcg gtgtctgcat attctaatgt ttttaaatga tgttttaaag aattgaaact 14100aacatactgt tctgctttct cccggtttat agccaggtga ctttatacac actttgggag 14160atgcacatat ttacctgaat cacatcgagc cactgaaaat tcaggtaaga attagatgtt 14220atacttttgg gtttggtacc ttctcttgat aaaaggttga ctgtggaaca ggtatctgct 14280caatgctgtg tccaagataa agatgactgc tccaaatgtg gggcttcagt ttagggagaa 14340gtggtgggca ggtgggcagg acaaggcagg catctgcctc agcaaccatg gcacttaact 14400tgtcaggtgc tgtgaggtac taagcaccag taccagagag ggaagagcca cattcaagcc 14460aggggattgt ccaaaaggag gcattttaac tcattttaac ttgaaggaga attgaagtgc 14520aaatgttttt ccttttcttt ttttttgaga tggagtcttt ctctgtcggc caggctggag 14580tgtgccgtgg tgcgatctca gctcactgca acctccacct cccgggttca agcaattctt 14640ctgcctcagc ctcccaggta gctgggatta caggcacatg ccaccacacc cagctaattt 14700tttgtattat tagtagagat ggggtttcgt catgttggcc aggctgatct caaactcctg 14760acttcaagtg taccacctgc ctcagcctcc gaaagttctg gaattacagg cataagccac 14820caccctggcc ataaatattt tttgttaatt ttacattaag tacaatattt aggtccaaac 14880ttcaaaagtc tgttgaaatc cctgaagtta tagcagccaa caattgatat gaaatggcaa 14940taaaaatgta agttcatctg cttcatgagc cttaaggaaa aaaactcaga 
accagacact 15000ttttagcccc ttccaggtta gatccaggtt ttaaaagtta ttcctttgag ggagtttggc 15060tgcttttgag tggaggtgac ttcaggctta ttctctctgg ctctctgctc tggtcatttt 15120tagacatagt aataggttgt gacctgtctt cacatcctaa ttgccactgt ctgttcatcc 15180caggaatcct ggctttcatc cctttctgtt cactgtccat gcatgtcatc tttccttctt 15240tctgccaggg accagatggg ttagggattg tgaattcaag taaacgtaga gctactatga 15300gttacagatt gactgtgttc ctgtctttaa taaatttgcc aagagtggtt ataagaactt 15360acacctgatg aggcaccagg ctcctgatgc tgtgtaatgt cacaaaatac ccctcactct 15420cgatctgtgc aagagaacag ctggttgcgc tccaatcatg ttacataacc tacgcgaagg 15480tatcgacagg atcatactcc tgtaaaatag aactttgttg atcacatcct gtgtacttgt 15540ttcacggaca tgaggagcaa ttacaacagg tcgtacaatt atggcaaaat aatggcctta 15600ttttgttttt agcttcagcg agaacccaga cctttcccaa agctcaggat tcttcgaaaa 15660gttgagaaaa ttgatgactt caaagctgaa gactttcaga ttgaagggta caatccgcat 15720ccaactatta aaatggaaat ggctgtttag ggtgctttca aaggagctcg aaggatattg 15780tcagtcttta ggggttgggc tggatgccga ggtaaaagtt ctttttgctc taaaagaaaa 15840aggaactagg tcaaaaatct gtccgtgacc tatcagttat taatttttaa ggatgttgcc 15900actggcaaat gtaactgtgc cagttctttc cataataaaa ggctttgagt taactcactg 15960agggtatctg acaatgctga ggttatgaac aaagtgagga gaatgaaatg tatgtgctct 16020tagcaaaaac atgtatgtgc atttcaatcc cacgtactta taaagaaggt tggtgaattt 16080cacaagctat ttttggaata tttttagaat attttaagaa tttcacaagc tattccctca 16140aatctgaggg agctgagtaa caccatcgat catgatgtag agtgtggtta tgaactttaa 16200agttatagtt gttttatatg ttgctataat aaagaagtgt tctgcattcg tccacgcttt 16260gttcattctg tactgccact tatctgctca gttccttcct aaaatagatt aaagaactct 16320ccttaagtaa acatgtgctg tattctggtt tggatgctac ttaaaagagt atattttaga 16380aataatagtg aatatatttt gccctatttt tctcatttta actgcatctt atcctcaaaa 16440tataatgacc atttaggata gagttttttt tttttttttt taaactttta taaccttaaa 16500gggttatttt aaaataatct atggactacc attttgccct cattagcttc agcatggtgt 16560gacttctcta ataatatgct tagattaagc aaggaaaaga tgcaaaacca cttcggggtt 16620aatcagtgaa atatttttcc cttcgttgca taccagatac ccccggtgtt gcacgactat 
16680ttttattctg ctaatttatg acaagtgtta aacagaacaa ggaattattc caacaagtta 16740tgcaacatgt tgcttatttt caaattacag tttaatgtct aggtgccagc ccttgatata 16800gctatttttg taagaacatc ctcctggact ttgggttagt taaatctaaa cttatttaag 16860gattaagtag gataacgtgc attgatttgc taaaagaatc aagtaataat tacttagctg 16920attcctgagg gtggtatgac ttctagctga actcatcttg atcggtagga ttttttaaat 16980ccatttttgt aaaactattt ccaagaaatt ttaagccctt tcacttcaga aagaaaaaag 17040ttgttggggc tgagcactta attttcttga gcaggaagga gtttcttcca aacttcacca 17100tctggagact ggtgtttctt tacagattcc tccttcattt ctgttgagta gccgggatcc 17160tatcaaagac caaaaaaatg agtcctgtta acaaccacct ggaacaaaaa cagattttat 17220gcatttatgc tgctccaaga aatgctttta cgtctaagcc agaggcaatt aattaatttt 17280tttttttttg acatggagtc actgtccgtt gcccaggctg cagtgcagtg gcgcaatctt 17340ggctcactgc aacctccacc tcccaggttc aagtgattct cctgcctcag cctcccatgt 17400agctgggatc acaggcacct gccaccatgc ccggctaatt ttttgtattt tttgtagaga 17460cagggtttca ccatgttggc caggctggtc tcaaacacct gacctcaaat gatccacctg 17520cctcagcctc ccaaagtgtt gggattacag gcgtaagcca ccatgcccag ccctgaatta 17580atatttttaa aataagtttg gagactgttg gaaataatag ggcagaggaa catattttac 17640tggctacttg ccagagttag ttaactcatc aaactctttg ataatagttt gacctctgtt 17700ggtgaaaatg agccatgatc tcttgaacat gatcagaata aatgccccag ccacacaatt 17760gtagtccaaa ctttttaggt cactaacttg ctagatggtg ccaggttttt ttgcacaagg 17820agtgcaaatg ttaagatctc cactagtgag gaaaggctag tattacagaa gccttgtcag 17880aggcaattga acctccaagc cctggccctc aggcctgagg attttgatac agacaaactg 17940aagaaccgtt tgttagtgga tattgcaaac aaacaggagt caaagcttgg tgctccacag 18000tctagttcac gagacaggcg tggcagtggc tggcagcatc tcttctcaca ggggccctca 18060ggcacagctt accttgggag gcatgtagga agcccgctgg atcatcacgg gatacttgaa 18120atgctcatgc aggtggtcaa catactcaca caccctagga ggagggaatc agatcggggc 18180aatgatgcct gaagtcagat tattcacgtg gtgctaactt aaagcagaag gagcgagtac 18240cactcaattg acagtgttgg ccaaggctta gctgtgttac catgcgtttc taggcaagtc 18300cctaaacctc tgtgcctcag gtccttttct tctaaaatat agcaatgtga ggtggggact 
18360ttgatgacat gaacacacga agtccctctg agaggttttg tggtgccctt taaaagggat 18420caattcagac tctgtaaata tccagaatta tttgggttcc tctggtcaaa agtcagatga 18480atagattaaa atcaccacat tttgtgatct atttttcaag aagcgtttgt attttttcat 18540atggctgcag cagctgccag gggcttgggg tttttttggc aggtagggtt gggagg 18596 1411536 DNA Homo sapiens 141 gggggggggg ggaccacttg gcctgcctcc gtcccgccgcgccacttggc ctgcctccgt 60 cccgccgcgc cacttcgcct gcctccgtcc cccgcccgccgcgccatgcc tgtggccggc 120 tcggagctgc cgcgccggcc cttgcccccc gccgcacaggagcgggacgc cgagccgcgt 180 ccgccgcacg gggagctgca gtacctgggg cagatccaacacatcctccg ctgcggcgtc 240 aggaaggacg accgcacggg caccggcacc ctgtcggtattcggcatgca ggcgcgctac 300 agcctgagag atgaattccc tctgctgaca accaaacgtgtgttctggaa gggtgttttg 360 gaggagttgc tgtggtttat caagggatcc acaaatgctaaagagctgtc ttccaaggga 420 gtgaaaatct gggatgccaa tggatcccga gactttttggacagcctggg attctccacc 480 agagaagaag gggacttggg cccagtttat ggcttccagtggaggcattt tggggcagaa 540 tacagagata tggaatcaga ttattcagga cagggagttgaccaactgca aagagtgatt 600 gacaccatca aaaccaaccc tgacgacaga agaatcatcatgtgcgcttg gaatccaaga 660 gatcttcctc tgatggcgct gcctccatgc catgccctctgccagttcta tgtggtgaac 720 agtgagctgt cctgccagct gtaccagaga tcgggagacatgggcctcgg tgtgcctttc 780 aacatcgcca gctacgccct gctcacgtac atgattgcgcacatcacggg cctgaagcca 840 ggtgacttta tacacacttt gggagatgca catatttacctgaatcacat cgagccactg 900 aaaattcagc ttcagcgaga acccagacct ttcccaaagctcaggattct tcgaaaagtt 960 gagaaaattg atgacttcaa agctgaagac tttcagattgaagggtacaa tccgcatcca 1020 actattaaaa tggaaatggc tgtttagggt gctttcaaaggagcttgaag gatattgtca 1080 gtctttaggg gttgggctgg atgccgaggt aaaagttctttttgctctaa aagaaaaagg 1140 aactaggtca aaaatctgtc cgtgacctat cagttattaatttttaagga tgttgccact 1200 ggcaaatgta actgtgccag ttctttccat aataaaaggctttgagttaa ctcactgagg 1260 gtatctgaca atgctgaggt tatgaacaaa gtgaggagaatgaaatgtat gtgctcttag 1320 caaaaacatg tatgtgcatt tcaatcccac gtacttataaagaaggttgg tgaatttcac 1380 aagctatttt tggaatattt ttagaatatt ttaagaatttcacaagctat tccctcaaat 1440 ctgagggagc tgagtaacac 
catcgatcat gatgtagagt gtggttatga actttatagt 1500 tgttttatat gttgctataa taaagaagtg ttctgc 1536 142 313 PRT Homo sapiens 142 Met Pro Val Ala Gly Ser Glu Leu Pro Arg Arg Pro Leu Pro Pro Ala 1 5 10 15 Ala Gln Glu Arg Asp Ala Glu Pro Arg Pro Pro His Gly Glu Leu Gln 20 25 30 Tyr Leu Gly Gln Ile Gln His Ile Leu Arg Cys Gly Val Arg Lys Asp 35 40 45 Asp Arg Thr Gly Thr Gly Thr Leu Ser Val Phe Gly Met Gln Ala Arg 50 55 60 Tyr Ser Leu Arg Asp Glu Phe Pro Leu Leu Thr Thr Lys Arg Val Phe 65 70 75 80 Trp Lys Gly Val Leu Glu Glu Leu Leu Trp Phe Ile Lys Gly Ser Thr 85 90 95 Asn Ala Lys Glu Leu Ser Ser Lys Gly Val Lys Ile Trp Asp Ala Asn 100 105 110 Gly Ser Arg Asp Phe Leu Asp Ser Leu Gly Phe Ser Thr Arg Glu Glu 115 120 125 Gly Asp Leu Gly Pro Val Tyr Gly Phe Gln Trp Arg His Phe Gly Ala 130 135 140 Glu Tyr Arg Asp Met Glu Ser Asp Tyr Ser Gly Gln Gly Val Asp Gln 145 150 155 160 Leu Gln Arg Val Ile Asp Thr Ile Lys Thr Asn Pro Asp Asp Arg Arg 165 170 175 Ile Ile Met Cys Ala Trp Asn Pro Arg Asp Leu Pro Leu Met Ala Leu 180 185 190 Pro Pro Cys His Ala Leu Cys Gln Phe Tyr Val Val Asn Ser Glu Leu 195 200 205 Ser Cys Gln Leu Tyr Gln Arg Ser Gly Asp Met Gly Leu Gly Val Pro 210 215 220 Phe Asn Ile Ala Ser Tyr Ala Leu Leu Thr Tyr Met Ile Ala His Ile 225 230 235 240 Thr Gly Leu Lys Pro Gly Asp Phe Ile His Thr Leu Gly Asp Ala His 245 250 255 Ile Tyr Leu Asn His Ile Glu Pro Leu Lys Ile Gln Leu Gln Arg Glu 260 265 270 Pro Arg Pro Phe Pro Lys Leu Arg Ile Leu Arg Lys Val Glu Lys Ile 275 280 285 Asp Asp Phe Lys Ala Glu Asp Phe Gln Ile Glu Gly Tyr Asn Pro His 290 295 300 Pro Thr Ile Lys Met Glu Met Ala Val 305 310 143 942 DNA Homo sapiens 143 atgcctgtgg ccggctcgga gctgccgcgc cggcccttgc cccccgccgc acaggagcgg 60 gacgccgagc cgcgtccgcc gcacggggag ctgcagtacc tggggcagat ccaacacatc 120 ctccgctgcg gcgtcaggaa ggacgaccgc acgggcaccg gcaccctgtc ggtattcggc 180 atgcaggcgc gctacagcct gagagatgaa ttccctctgc tgacaaccaa acgtgtgttc 240 tggaagggtg ttttggagga gttgctgtgg tttatcaagg gatccacaaa tgctaaagag 300 ctgtcttcca agggagtgaa aatctgggat
gccaatggat cccgagactt tttggacagc 360 ctgggattct ccaccagaga agaaggggac ttgggcccag tttatggctt ccagtggagg 420 cattttgggg cagaatacag agatatggaa tcagattatt caggacaggg agttgaccaa 480 ctgcaaagag tgattgacac catcaaaacc aaccctgacg acagaagaat catcatgtgc 540 gcttggaatc caagagatct tcctctgatg gcgctgcctc catgccatgc cctctgccag 600 ttctatgtgg tgaacagtga gctgtcctgc cagctgtacc agagatcggg agacatgggc 660 ctcggtgtgc ctttcaacat cgccagctac gccctgctca cgtacatgat tgcgcacatc 720 acgggcctga agccaggtga ctttatacac actttgggag atgcacatat ttacctgaat 780 cacatcgagc cactgaaaat tcagcttcag cgagaaccca gacctttccc aaagctcagg 840 attcttcgaa aagttgagaa aattgatgac ttcaaagctg aagactttca gattgaaggg 900 tacaatccgc atccaactat taaaatggaa atggctgttt ag 942 144 186 PRT Homo sapiens 144 Met Pro Val Ala Gly Ser Glu Leu Pro Arg Arg Pro Leu Pro Pro Ala 1 5 10 15 Ala Gln Glu Arg Asp Ala Glu Pro Arg Pro Pro His Gly Glu Leu Gln 20 25 30 Tyr Leu Gly Gln Ile Gln His Ile Leu Arg Cys Gly Val Arg Lys Asp 35 40 45 Asp Arg Thr Gly Thr Gly Thr Leu Ser Val Phe Gly Met Gln Ala Arg 50 55 60 Tyr Ser Leu Arg Asp Glu Phe Pro Leu Leu Thr Thr Lys Arg Val Phe 65 70 75 80 Trp Lys Gly Val Leu Glu Glu Leu Leu Trp Phe Ile Lys Gly Ser Thr 85 90 95 Asn Ala Lys Glu Leu Ser Ser Lys Gly Val Lys Ile Trp Asp Ala Asn 100 105 110 Gly Ser Arg Asp Phe Leu Asp Ser Leu Gly Phe Ser Thr Arg Glu Glu 115 120 125 Gly Asp Leu Gly Pro Val Tyr Gly Phe Gln Trp Arg His Phe Gly Ala 130 135 140 Glu Tyr Arg Asp Met Glu Ser Asp Tyr Ser Gly Gln Gly Val Asp Gln 145 150 155 160 Leu Gln Arg Val Ile Asp Thr Ile Lys Thr Asn Pro Asp Asp Arg Arg 165 170 175 Ile Ile Met Cys Ala Trp Asn Pro Arg Asp 180 185 145 70 PRT Homo sapiens 145 Lys Pro Gly Asp Phe Ile His Thr Leu Gly Asp Ala His Ile Tyr Leu 1 5 10 15 Asn His Ile Glu Pro Leu Lys Ile Gln Leu Gln Arg Glu Pro Arg Pro 20 25 30 Phe Pro Lys Leu Arg Ile Leu Arg Lys Val Glu Lys Ile Asp Asp Phe 35 40 45 Lys Ala Glu Asp Phe Gln Ile Glu Gly Tyr Asn Pro His Pro Thr Ile 50 55 60 Lys Met Glu Met Ala Val 65 70 146 18 PRT Homo sapiens 146 Leu Pro Leu Met Ala Leu Pro
Pro Cys His Ala Leu Cys Gln Phe Tyr 1 5 10 15 Val Val147 25 PRT Homo sapiens 147 Met Gly Leu Gly Val Pro Phe Asn Ile Ala SerTyr Ala Leu Leu Thr 1 5 10 15 Tyr Met Ile Ala His Ile Thr Gly Leu 20 25148 14 PRT Homo sapiens 148 Asn Ser Glu Leu Ser Cys Gln Leu Tyr Gln ArgSer Gly Asp 1 5 10 149 14 PRT Homo sapiens 149 Asn Ser Glu Leu Ser CysGln Leu Tyr Gln Arg Ser Gly Asp 1 5 10 150 18 PRT Homo sapiens 150 LeuPro Leu Met Ala Leu Pro Pro Cys His Ala Leu Cys Gln Phe Tyr 1 5 10 15Val Val 151 25 PRT Homo sapiens 151 Met Gly Leu Gly Val Pro Phe Asn IleAla Ser Tyr Ala Leu Leu Thr 1 5 10 15 Tyr Met Ile Ala His Ile Thr GlyLeu 20 25 152 186 PRT Homo sapiens 152 Met Pro Val Ala Gly Ser Glu LeuPro Arg Arg Pro Leu Pro Pro Ala 1 5 10 15 Ala Gln Glu Arg Asp Ala GluPro Arg Pro Pro His Gly Glu Leu Gln 20 25 30 Tyr Leu Gly Gln Ile Gln HisIle Leu Arg Cys Gly Val Arg Lys Asp 35 40 45 Asp Arg Thr Gly Thr Gly ThrLeu Ser Val Phe Gly Met Gln Ala Arg 50 55 60 Tyr Ser Leu Arg Asp Glu PhePro Leu Leu Thr Thr Lys Arg Val Phe 65 70 75 80 Trp Lys Gly Val Leu GluGlu Leu Leu Trp Phe Ile Lys Gly Ser Thr 85 90 95 Asn Ala Lys Glu Leu SerSer Lys Gly Val Lys Ile Trp Asp Ala Asn 100 105 110 Gly Ser Arg Asp PheLeu Asp Ser Leu Gly Phe Ser Thr Arg Glu Glu 115 120 125 Gly Asp Leu GlyPro Val Tyr Gly Phe Gln Trp Arg His Phe Gly Ala 130 135 140 Glu Tyr ArgAsp Met Glu Ser Asp Tyr Ser Gly Gln Gly Val Asp Gln 145 150 155 160 LeuGln Arg Val Ile Asp Thr Ile Lys Thr Asn Pro Asp Asp Arg Arg 165 170 175Ile Ile Met Cys Ala Trp Asn Pro Arg Asp 180 185 153 70 PRT Homo sapiens153 Lys Pro Gly Asp Phe Ile His Thr Leu Gly Asp Ala His Ile Tyr Leu 1 510 15 Asn His Ile Glu Pro Leu Lys Ile Gln Leu Gln Arg Glu Pro Arg Pro 2025 30 Phe Pro Lys Leu Arg Ile Leu Arg Lys Val Glu Lys Ile Asp Asp Phe 3540 45 Lys Ala Glu Asp Phe Gln Ile Glu Gly Tyr Asn Pro His Pro Thr Ile 5055 60 Lys Met Glu Met Ala Val 65 70 154 23 DNA Homo sapiens 154gtcatgcttt tatacattct ggc 23 155 25 DNA Homo sapiens 155 ttatctgtttagatcagcac tacac 25 156 28 DNA Homo sapiens 
156 gtacttgata tttatatacatcctaatc 28 157 21 DNA Homo sapiens 157 gtaatccaac actttgggag g 21 15870 PRT Homo sapiens 158 Lys Pro Gly Asp Phe Ile His Thr Leu Gly Asp AlaHis Ile Tyr Leu 1 5 10 15 Asn His Ile Glu Pro Leu Lys Ile Gln Leu GlnArg Glu Pro Arg Pro 20 25 30 Phe Pro Lys Leu Arg Ile Leu Arg Lys Val GluLys Ile Asp Asp Phe 35 40 45 Lys Ala Glu Asp Phe Gln Ile Glu Gly Tyr AsnPro His Pro Thr Ile 50 55 60 Lys Met Glu Met Ala Val 65 70 159 437 PRTH. sapiens 159 Met Lys Ile Lys Ala Glu Lys Asn Glu Gly Pro Ser Arg SerTrp Trp 1 5 10 15 Gln Leu His Trp Gly Asp Ile Ala Asn Asn Ser Gly AsnMet Lys Pro 20 25 30 Pro Leu Leu Val Phe Ile Val Cys Leu Leu Trp Leu LysAsp Ser His 35 40 45 Cys Ala Pro Thr Trp Lys Asp Lys Thr Ala Ile Ser GluAsn Leu Lys 50 55 60 Ser Phe Ser Glu Val Gly Glu Ile Asp Ala Asp Glu GluVal Lys Lys 65 70 75 80 Ala Leu Thr Gly Ile Lys Gln Met Lys Ile Met MetGlu Arg Lys Glu 85 90 95 Lys Glu His Thr Asn Leu Met Ser Thr Leu Lys LysCys Arg Glu Glu 100 105 110 Lys Gln Glu Ala Leu Lys Leu Leu Asn Glu ValGln Glu His Leu Glu 115 120 125 Glu Glu Glu Arg Leu Cys Arg Glu Ser LeuAla Asp Ser Trp Gly Glu 130 135 140 Cys Arg Ser Cys Leu Glu Asn Asn CysMet Arg Ile Tyr Thr Thr Cys 145 150 155 160 Gln Pro Ser Trp Ser Ser ValLys Asn Lys Ile Glu Arg Phe Phe Arg 165 170 175 Lys Ile Tyr Gln Phe LeuPhe Pro Phe His Glu Asp Asn Glu Lys Asp 180 185 190 Leu Pro Ile Ser GluLys Leu Ile Glu Glu Asp Ala Gln Leu Thr Gln 195 200 205 Met Glu Asp ValPhe Ser Gln Leu Thr Val Asp Val Asn Ser Leu Phe 210 215 220 Asn Arg SerPhe Asn Val Phe Arg Gln Met Gln Gln Glu Phe Asp Gln 225 230 235 240 ThrPhe Gln Ser His Phe Ile Ser Asp Thr Asp Leu Thr Glu Pro Tyr 245 250 255Phe Phe Pro Ala Phe Ser Lys Glu Pro Met Thr Lys Ala Asp Leu Glu 260 265270 Gln Cys Trp Asp Ile Pro Asn Phe Phe Gln Leu Phe Cys Asn Phe Ser 275280 285 Val Ser Ile Tyr Glu Ser Val Ser Glu Thr Ile Thr Lys Met Leu Lys290 295 300 Ala Ile Glu Asp Leu Pro Lys Gln Asp Lys Ala Pro Asp His GlyGly 305 310 315 320 Leu Ile Ser Lys Met Leu Pro Gly Gln 
Asp Arg Gly LeuCys Gly Glu 325 330 335 Leu Asp Gln Asn Leu Ser Arg Cys Phe Lys Phe HisGlu Lys Cys Gln 340 345 350 Lys Cys Gln Ala His Leu Ser Glu Asp Cys ProAsp Val Pro Ala Leu 355 360 365 His Thr Glu Leu Asp Glu Ala Ile Arg LeuVal Asn Val Ser Asn Gln 370 375 380 Gln Tyr Gly Gln Ile Leu Gln Met ThrArg Lys His Leu Glu Asp Thr 385 390 395 400 Ala Tyr Leu Val Glu Lys MetArg Gly Gln Phe Gly Trp Val Ser Glu 405 410 415 Leu Ala Asn Gln Ala ProGlu Thr Glu Ile Ile Phe Arg Arg Ser Asn 420 425 430 Ala Ser Tyr Ile Gln435 160 1134 DNA H. sapiens 160 cctgaaagcc tggcgccaat gacccgcgagacattttttg cctggggtgc tcctgtcgga 60 aaggaaagag gaaaggacga ctaagaactcgaactcccga atttctcttt tcaaggttta 120 agaggaaagc tggttcgtgg ggattggatgggaggccacc aggaaaccaa gttcccgcgc 180 cagcttcagt gctstcctct tcccgccgcctttgccccgc ccacatcact ttcgctccag 240 tttttgaaaa cgctgcgaag cggaatggtccacaggggaa aacggaggag gggccaaagc 300 caggactttg agaccggcgc gcggtcaagcccaggcagct ctccctaacc ctccagcact 360 gggcaaacgc tgcccgatga cgcccgcctcgggggccacg gcatcactgg ggcgactgcg 420 agcccggccg cggagccgct gggacgcggcttacctcccg gctgtcgctg ctgtgtgtgt 480 tgcccgcgcc agtcacgtcc ctaatgggaccctccgtttc ggcgtctgta aggcgaggag 540 gacgatgcgt cccctccctg gcaggattgaggttaggact aaacggggtc cgcagcgccc 600 ggcagctccc gagcgctctc cccagccgcgcctccctcct tcccgccacc cgtcccgcag 660 gggcccgcgg cgtcacctct caggctgtagcgcgcctgca tgccgaatac cgacagggtg 720 ccggtgcccg tgcggtcgtc cttcctgacgccgcagcgga ggatgtgttg gatctgcccc 780 aggtactttc aggatttcca ggtcccagatgaagagataa ttctacttac tggatatagg 840 atgcattaga tcttcttacc ttaaaaaaaaaaaaaaagca gcaatgatca aaatactaat 900 aaattactca cagactcagt gtattttttcttggagtaaa agtccaggat gggtaataga 960 atacctgctg ttggcttttg gaaaaattggtactgtgtgt agcaaaataa tgtgaaaccc 1020 atatgcatgg atattcttaa caatttgaagaaatcgtcac agctttcctg ggttgttgag 1080 cctctaagat ggtcttttcc tctgatgtgataataaagtg tttattctga actc 1134 161 50 PRT H. 
sapien misc_feature (45)...(45) Xaa = Ile or Leu 161 Phe Gly Trp Val Ser Glu Leu Ala Asn Gln Ala Pro Glu Thr Glu Ile 1 5 10 15 Ile Phe Asn Ser Ile Gln Val Val Pro Arg Ile His Glu Gly Asn Ile 20 25 30 Ser Lys Gln Asp Glu Thr Met Met Thr Asp Leu Ser Xaa Pro Ser Ser 35 40 45 Asn Phe 50 162 49 PRT bovine misc_feature (44)...(44) Xaa = Ile or Leu 162 Phe Gly Trp Val Thr Glu Leu Ala Ser Gln Thr Pro Gly Ser Glu Asn 1 5 10 15 Ile Phe Ser Phe Ile Lys Val Val Pro Gly Val His Glu Gly Asn Phe 20 25 30 Ser Lys Gln Asp Glu Lys Met Ile Asp Ile Ser Xaa Pro Ser Ser Asn 35 40 45 Phe 163 51 PRT guinea pig misc_feature (46)...(46) Xaa = Ile or Leu 163 Phe Gly Trp Val Leu Glu Leu Ala Tyr Gln Ser Pro Gly Ala Glu Asp 1 5 10 15 Ile Phe Asn Pro Val Lys Val Met Val Ala Leu Ser Ala His Glu Gly 20 25 30 Asn Ser Ser Asp Gln Asp Asp Thr Val Val Pro Ser Ser Xaa Pro Ser 35 40 45 Ser Asn Phe 50 164 49 PRT rat misc_feature (44)...(44) Xaa = Ile or Leu 164 Phe Gly Trp Val Ser Gln Leu Ala Ser His Asn Pro Val Thr Glu Asp 1 5 10 15 Ile Phe Asn Ser Thr Lys Ala Val Pro Lys Ile His Gly Gly Asp Ser 20 25 30 Ser Lys Gln Asp Glu Ile Met Val Asp Ser Ser Xaa Pro Ser Ser Asn 35 40 45 Phe 165 1767 DNA Cavia sp.
165 cttggagtca actgagtgtg gactgaaact tccaaaaactgacatgagga gtcactggag 60 aatcatgatc aaggagctac acactctgac ttaactttattctgtggaca atgagagaca 120 actgcaagga ttaacagtga gaacatgaag ctgccacttttgatgtttcc cgtgtgtctg 180 ctatggttga aagactgtca ttgtgcacct acttggaaggacaaaactgc catcagtgaa 240 aacgcgaaca gtttttctga ggctggggag atagacgtagatggagaggt gaagatagct 300 ttgattggca ttaaacagat gaaaatcatg atggaaaggagagaggaaga acacagcaaa 360 ctaatgaaaa ccttgaagaa gtgcaaagaa gaaaagcaggaggccctgaa acttatgaat 420 gaagttcatg aacacctgga ggaggaagaa agcttatgccaggtttctct ggcagattcc 480 tgggatgaat gcagggcttg cctggaaagt aactgcatgaggtttgatac cacctgccaa 540 cctgcatggt cctctgtgaa aaatatggaa aatgacagaagtggccctgt cagcaaaggg 600 gtcactgagg aagatgcgca ggtgtcacac atagagcatgtgttcagcca gctgagcgca 660 gatgtgacat ctctcttcaa cagaagcctt tacgtcttcaaacagctgcg gcgagaattt 720 gaccaggctt ttcagtcata tttcacatcg gggactgacgttacagagcc tttctttttt 780 ccatctttgt ccaaggagcc agcctacaga gcagatgctgagccaagctg ggccattccc 840 aatgtcttcc agctgctctg caacttgagt ttctcagtttatcaaagtgt cagtgaaaaa 900 ctcatcacaa ccctgcgtgc cacagaggac cctccaaaacaagacaaaga ctccaaccag 960 ggaggcccga tttcaaagat actacctgag caagacagaggctcagatgg gaaacttggc 1020 cagaatttgt ctgattgcgt taattttcgc aagagatgccagaaatgcca ggattatcta 1080 tctgatgact gccctaatgt gcctgaacta tacagagaactcaatgaggc cctccgactg 1140 gtcagtagat ccaatcagca atacgaccag gtggtgcagatgacccagta tcacctggaa 1200 gacaccacgc ttctgatgga gaagatgaga gagcagtttggctgggtttc tgaactggca 1260 taccagtccc caggagctga ggacatcttt aatccagtgaaagtaatggt agccctaagt 1320 gctcatgaag gaaattcttc tgatcaagat gacacagtggttccttcaag cctcctgcct 1380 tcctctaact tcacactcag cagccctctt gaaaagagtgctggcaacgc taacttcatt 1440 gatcacgtgg tagagaaggt tcttcagcac tttaaggagcactttaaaac ttggtaagaa 1500 gatttagtcc atcctataat cagcaagaat tacaccttcggccaagacct gagaattctg 1560 aaaatacaaa gcaggctaac acaatgaaca cagctgcatgaaagttaggt atatattagg 1620 aagcactatt ggtttacttt gttgaatgga agtttaatagctattcaaat tgagttaata 1680 taaaaatttc ttcctaaaaa gtaaaatgta 
catatgtaga atatgatgca ttagttcttt 1740 gtatactaaa taaatactga gtcccct 1767

What is claimed is:
 1. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes a HKNG1 gene product comprising: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; or (h) the amino acid sequence of SEQ ID NO:66.
 2. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule comprises: (a) the nucleotide sequence of SEQ ID NO:1; (b) the nucleotide sequence of SEQ ID NO:3; (c) the nucleotide sequence of SEQ ID NO:7; (d) the nucleotide sequence of SEQ ID NO:34; or (e) the nucleotide sequence of SEQ ID NO:35.
 3. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule comprises: (a) the nucleotide sequence of SEQ ID NO:38; (b) the nucleotide sequence of SEQ ID NO:40; (c) the nucleotide sequence of SEQ ID NO:42; or (d) the nucleotide sequence of SEQ ID NO:44.
 4. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule comprises: (a) the nucleotide sequence of SEQ ID NO:46; (b) the nucleotide sequence of SEQ ID NO:47; or (c) the nucleotide sequence of SEQ ID NO:48.
 5. An isolated nucleic acid molecule consisting of a nucleotide sequence that encodes a mature HKNG1 protein having the amino acid sequence of SEQ ID NO:51.
 6. An isolated nucleic acid molecule which hybridizes to the complement of the nucleic acid molecule of any one of claims 1-5 under highly stringent conditions comprising washing in 0.1×SSC/0.1% SDS at 68° C.
 7. An isolated nucleic acid molecule which hybridizes to the complement of the nucleic acid molecule of any one of claims 1-5 under stringent conditions comprising washing in 0.2×SSC/0.1% SDS at 50-65° C.
 8. The isolated nucleic acid molecule of claim 6 or 7, wherein said isolated nucleic acid molecule encodes a functionally equivalent HKNG1 gene product.
 9. A vector comprising the nucleotide sequence of any one of claims 1-5.
 10. An expression vector comprising the nucleotide sequence of any one of claims 1-5 operatively associated with a regulatory nucleotide sequence controlling the expression of the nucleotide sequence in a host cell.
 11. A host cell genetically engineered to contain the nucleotide sequence of any one of claims 1-5.
 12. A host cell genetically engineered to express the nucleotide sequence of any one of claims 1-5 operatively associated with a regulatory nucleotide sequence controlling expression of the nucleotide sequence in said host cell.
 13. An isolated polypeptide comprising the amino acid sequence of a HKNG1 gene product having: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; or (h) the amino acid sequence of SEQ ID NO:66.
 14. An isolated polypeptide consisting of a mature HKNG1 gene product having the amino acid sequence of SEQ ID NO:51.
 15. An isolated polypeptide comprising an amino acid sequence encoded by the isolated nucleic acid molecule of claim 6 or 7.
 16. An antibody which selectively binds to the HKNG1 gene product of any one of claims 13 or 14.
 17. A method for treating a HKNG1-mediated disorder in an individual comprising administering to the individual a compound which modulates the expression of an HKNG1 gene in the individual.
 18. The method of claim 17, wherein the compound inhibits or potentiates the expression of an HKNG1 gene in the individual.
 19. The method of claim 17, wherein the compound is a small molecule.
 20. The method of claim 17, wherein the HKNG1-mediated disorder is a neuropsychiatric disorder.
 21. The method of claim 17, wherein the neuropsychiatric disorder is bipolar affective disorder or schizophrenia.
 22. The method of claim 17, wherein the HKNG1 gene encodes a HKNG1 gene product comprising: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; (h) the amino acid sequence of SEQ ID NO:51; (i) the amino acid sequence of SEQ ID NO:64; or (j) the amino acid sequence of SEQ ID NO:66.
 23. The method of claim 17, wherein the individual is a mammal.
 24. The method of claim 23, wherein the mammal is a human.
 25. A method for treating a HKNG1-mediated disorder in an individual comprising administering to the individual a compound which modulates the expression or activity of a HKNG1 gene product in the individual.
 26. The method of claim 25, wherein the compound inhibits or potentiates the expression or activity of a HKNG1 gene product in the individual.
 27. The method of claim 25, wherein the compound is a small molecule.
 28. The method of claim 25, wherein the HKNG1-mediated disorder is a neuropsychiatric disorder.
 29. The method of claim 28, wherein the neuropsychiatric disorder is bipolar affective disorder or schizophrenia.
 30. The method of claim 25, wherein the HKNG1 gene product comprises: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; (h) the amino acid sequence of SEQ ID NO:51; (i) the amino acid sequence of SEQ ID NO:64; or (j) the amino acid sequence of SEQ ID NO:66.
 31. The method of claim 25, wherein the individual is a mammal.
 32. The method of claim 31, wherein the mammal is a human.
 33. A method for identifying a compound which modulates expression of an HKNG1 gene comprising: (a) contacting a test compound to a cell that expresses an HKNG1 gene; (b) measuring a level of HKNG1 gene expression in the cell; (c) comparing the level of HKNG1 gene expression in the cell in the presence of the test compound to a level of HKNG1 gene expression in the cell in the absence of the test compound, wherein if the level of HKNG1 gene expression in the cell in the presence of the test compound differs from the level of expression of the HKNG1 gene in the cell in the absence of the test compound, a compound that modulates expression of an HKNG1 gene is identified.
 34. The method of claim 33, wherein the HKNG1 gene encodes an HKNG1 gene product comprising: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; (h) the amino acid sequence of SEQ ID NO:51; (i) the amino acid sequence of SEQ ID NO:64; or (j) the amino acid sequence of SEQ ID NO:66.
 35. The method of claim 34, wherein the HKNG1 gene comprises: (a) the nucleotide sequence of SEQ ID NO:1; (b) the nucleotide sequence of SEQ ID NO:3; (c) the nucleotide sequence of SEQ ID NO:5; (d) the nucleotide sequence of SEQ ID NO:6; (e) the nucleotide sequence of SEQ ID NO:34; (f) the nucleotide sequence of SEQ ID NO:35; (g) the nucleotide sequence of SEQ ID NO:38; (h) the nucleotide sequence of SEQ ID NO:40; (i) the nucleotide sequence of SEQ ID NO:42; (j) the nucleotide sequence of SEQ ID NO:44; (k) the nucleotide sequence of SEQ ID NO:46; (l) the nucleotide sequence of SEQ ID NO:47; (m) the nucleotide sequence of SEQ ID NO:48; or (n) the nucleotide sequence of SEQ ID NO:65.
 36. A method for identifying a compound which modulates expression or activity of an HKNG1 gene product comprising: (a) contacting a test compound to a cell that expresses an HKNG1 gene product; (b) measuring a level of HKNG1 gene product expression or activity in the cell; (c) comparing the level of HKNG1 gene product expression or activity in the cell in the presence of the test compound to a level of HKNG1 gene product expression or activity in the cell in the absence of the test compound, wherein if the level of HKNG1 gene product expression or activity in the cell in the presence of the test compound differs from the level of HKNG1 gene product expression or activity in the cell in the absence of the test compound, a compound that modulates expression or activity of an HKNG1 gene product is identified.
 37. The method of claim 36, wherein the HKNG1 gene product comprises: (a) the amino acid sequence of SEQ ID NO:2; (b) the amino acid sequence of SEQ ID NO:4; (c) the amino acid sequence of SEQ ID NO:39; (d) the amino acid sequence of SEQ ID NO:41; (e) the amino acid sequence of SEQ ID NO:43; (f) the amino acid sequence of SEQ ID NO:45; (g) the amino acid sequence of SEQ ID NO:49; (h) the amino acid sequence of SEQ ID NO:51; or (i) the amino acid sequence of SEQ ID NO:64.
 38. A method for identifying an individual having or at risk of developing a HKNG1-mediated disorder comprising the step of detecting the presence or absence of a polymorphism that correlates with an HKNG1 allele associated with the disorder, wherein presence of the polymorphism indicates that the individual has or is at risk of developing the HKNG1-mediated disorder.
 39. The method of claim 38, wherein the mutation results in production of a protein comprising an amino acid sequence that is different from the amino acid sequence of SEQ ID NO:2 or 4.
 40. The method of claim 39, wherein the mutation results in the substitution of a lysine for a glutamic acid at amino acid residue 202 of SEQ ID NO:2.
 41. The method of claim 39, wherein the mutation results in the substitution of a lysine for a glutamic acid at amino acid residue 184 of SEQ ID NO:4.
 42. The method of claim 36, wherein the method comprises the step of analyzing the sequence of the coding region of the human HKNG1 gene by preparing and sequencing cDNA comprising a sequence that hybridizes under stringent conditions to the complement of a nucleotide sequence which encodes the polypeptide sequence depicted in SEQ ID NO:2.
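
By way of illustration only, the comparing step (c) recited in claims 33 and 36 may be sketched as follows. This is a minimal sketch, not the claimed method itself: the function names `modulates_expression` and `classify_compound`, the numeric example levels, and the relative tolerance `rel_tol` are all hypothetical. The claims require only that the two levels differ; the tolerance is an added assumption to allow for measurement noise.

```python
import math


def modulates_expression(level_with: float, level_without: float,
                         rel_tol: float = 0.10) -> bool:
    """Return True when the level measured in the presence of the test
    compound differs from the level measured in its absence.

    rel_tol is a hypothetical noise allowance (not recited in the claims):
    levels within this relative tolerance are treated as not differing.
    """
    return not math.isclose(level_with, level_without, rel_tol=rel_tol)


def classify_compound(level_with: float, level_without: float,
                      rel_tol: float = 0.10) -> str:
    """Label a test compound from the two measured levels.

    'potentiates' and 'inhibits' follow the wording of claims 18 and 26.
    """
    if not modulates_expression(level_with, level_without, rel_tol):
        return "no modulation detected"
    return "potentiates" if level_with > level_without else "inhibits"


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    print(classify_compound(150.0, 100.0))
    print(classify_compound(40.0, 100.0))
    print(classify_compound(102.0, 100.0))
```

In practice the threshold would be replaced by whatever statistical criterion suits the assay; the sketch shows only the direction of the comparison between treated and untreated cells.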